Abstract


We prove two theorems saying that no distributed sys- 
tem in which processes coordinate using reliable reg- 
isters and f-resilient services can solve the consensus 
problem in the presence of f + 1 undetectable process
stopping failures. (A service is f-resilient if it is guar- 
anteed to operate as long as no more than f of the 
processes connected to it fail.) 

Our first theorem assumes that the given services 
are atomic objects, and allows any connection pat- 
tern between processes and services. In contrast, we 
show that it is possible to boost the resilience of systems
solving problems easier than consensus: the k-set
consensus problem is solvable for 2k − 1 failures
using 1-resilient consensus services. The first theorem 
and its proof generalize to the larger class of failure- 
oblivious services. 

Our second theorem allows the system to contain 
failure-aware services, such as failure detectors, in addition
to failure-oblivious services; however, it requires
that each failure-aware service be connected to all processes.
Thus, f + 1 process failures overall can disable
all the failure-aware services. In contrast, it is
possible to boost the resilience of a system solving con- 
sensus if arbitrary patterns of connectivity are allowed 
between processes and failure-aware services: consen- 
sus is solvable for any number of failures using only 
1-resilient 2-process perfect failure detectors.


1 Introduction 


We consider distributed systems consisting of asyn- 
chronously operating processes that coordinate using 
reliable multi-writer multi-reader registers and shared 
services. A service is a distributed computing mech- 
anism that interacts with distributed processes, ac- 
cepting invocations, performing internal computation 
steps, and delivering responses. Examples of services
include: 


• Shared atomic (linearizable) objects, defined by sequential
type specifications [11,14], for example,
atomic read-modify-write, queue, counter, test&set, 
and compare&swap objects. The consensus problem 
can also be defined as an atomic object. 


• Concurrently-accessible data structures such as balanced
trees.


• Broadcast services such as totally ordered broadcast
[10].


• Failure detectors, which provide processes with
hints about the failure of other processes [5].¹


Thus, our notion of a service is quite general. We de- 
fine three successively more general classes of service— 
atomic objects, failure-oblivious services, and general 
(possibly failure-aware) services—in Sections 2, 6, and 
7. We define our services to tolerate a certain number 
f of failures: a service is f-resilient if it is guaranteed 
to operate as long as no more than f of the processes 
connected to it fail. 

A fundamental, general question in distributed 
computing theory is: “What problems can be solved 
by distributed systems, with what levels of resilience, 
using services of given types and levels of resilience?” 
In this paper, we expose a basic limitation on the 
achievable resilience, namely, that the resilience of 
a system cannot be “boosted” above that of its ser- 
vices. More specifically, we prove two theorems saying 
that no distributed system in which processes coordi- 
nate using reliable registers and f-resilient services can 
solve the consensus problem in the presence of f + 1 
process stopping failures. 


¹ Our notion of service encompasses all failure detectors defined
by Chandra et al. [4] with one exception: we exclude failure
detectors that can guess the future.


We focus on the consensus problem because it has 
been shown to be fundamental to the study of re- 
silience in distributed systems. For example, Herlihy 
has shown that consensus is universal [11]: an atomic 
object of any sequential type can be implemented in a 
wait-free manner (i.e., tolerating any number of fail- 
ures), using wait-free consensus objects. 

Our first main theorem, Theorem 1, assumes that 
the given services are atomic objects and allows any 
connection pattern between processes and services. 
The result is a strict generalization of the classical im- 
possibility result of Fischer et al. [8] for fault-tolerant 
consensus. Our simple, self-contained impossibility 
proof is based on a bivalence argument similar to the 
one in [8]. The proof involves showing that decisions 
can be made in a particular way, described by a hook 
pattern of executions. 

In contrast to the impossibility of boosting for consensus,
we show that it is possible to boost the resilience
of systems solving problems easier than consensus.
In particular, we show that the k-set consensus
problem [6] is solvable for 2k − 1 failures using
1-resilient consensus services.

Theorem 1 and its proof assume that the given ser- 
vices are atomic objects; however, they extend to the 
larger class of failure-oblivious services. A failure- 
oblivious service generalizes an atomic object by al- 
lowing an invocation to trigger multiple processing 
steps instead of just one, and to trigger any num- 
ber of responses, at any endpoints. The service may 
also include background processing tasks, not related 
to any specific endpoint. The key constraint is that 
no step may depend on explicit knowledge of failure 
events. We define the class of failure-oblivious ser- 
vices, give examples (e.g., totally-ordered broadcast), 
and describe how Theorem 1 can be extended to such 
services. 

Our second main theorem, Theorem 11, addresses 
the case where the system may contain failure-aware 
services (e.g., failure detectors), in addition to failure- 
oblivious services and reliable registers. This result 
also says that boosting is impossible. However, it 
requires the additional assumption that each failure-aware
service is connected to all processes; thus, f + 1
process failures overall can disable all the failure-aware 
services. The proof is an extension of the first proof, 
using the same “hook” construction. We also show 
that the stronger connectivity assumption is necessary, 
by demonstrating that it is possible to boost the re- 
silience of a system solving consensus if arbitrary con- 
nection patterns are allowed between processes and 
failure-aware services: specifically, consensus is solvable
for any number of failures using only 1-resilient
2-process perfect failure detectors.


Related work. Our Theorem 1, for atomic services, 
can be derived by carefully combining several earlier 
theorems, including Herlihy’s result on universality of 
consensus [11], and the result of Chandra et al. on 
f-resiliency vs. wait-freedom [3] (see Appendix A). 
However, this argument does not extend to prove im- 
possibility of boosting for failure-oblivious and failure- 
aware services. Moreover, some of the proofs upon 
which this alternative proof rests are themselves more 
complex than our direct proof. 

Theorem 1 appeared first in a technical report [1]. 
Subsequent impossibility results for atomic objects ap- 
peared in [9,15]. Our models for failure-oblivious ser- 
vices and general services are new. As far as we know, 
this is the first time a unified framework has been used 
to express atomic and non-atomic objects. Moreover, 
this is the first time boosting analysis has been per- 
formed for services more general than atomic objects. 


Organization. Section 2 presents definitions for the 
underlying model of concurrent computation and for 
atomic objects. Section 3 presents our model for a 
system whose services are atomic objects. Section 4 
presents the first impossibility result. Section 5 shows 
that boosting is possible for set consensus. Section 6 
defines failure-oblivious services, gives an example, 
and extends the first impossibility result to systems 
with failure-oblivious services. Section 7 defines gen- 
eral services, gives examples, and presents our sec- 
ond main impossibility result. Appendix A shows how 
Theorem 1 can be derived from results in [3,11] and 
why these arguments do not extend to services more 
general than atomic services. Appendix B provides 
the complete proofs for the extension of the first im- 
possibility result to failure-oblivious services. 


2 Mathematical Preliminaries 
2.1 Model of concurrent computation 


We use the I/O automaton model [18, chapter 8] 
as our underlying model for concurrent computation. 
We assume the terminology of [18, chapter 8]. An I/O 
automaton A is deterministic iff, for each task e of A, 
and each state s of A, there is at most one transition 
(s, a, s′) such that a ∈ e.

An execution α of A is fair iff for each task e of
A: (1) if α is finite, then e is not enabled in the final
state of α, and (2) if α is infinite, then α contains
either infinitely many actions of e, or infinitely many
occurrences of states in which e is not enabled. A trace
of A is a sequence of external actions of A obtained
by removing the states and internal actions from an
execution of A. A trace of a fair execution is called
a fair trace. If α and α′ are execution fragments of
A (with α finite) such that α′ starts in the last state
of α, then the concatenation α · α′ is defined, and is
called an extension of α.


2.2 Sequential types 


We define the notion of a “sequential type”, in or- 
der to describe allowable sequential behavior of atomic 
services. The definition used here generalizes the one 
in [18, chapter 9]: here, we allow nondeterminism 
in the choice of the initial state and the next state. 
Namely, a sequential type T = (V, V_0, invs, resps, δ) consists
of:


• V, a nonempty set of values,

• V_0 ⊆ V, a nonempty set of initial values,

• invs, a set of invocations,

• resps, a set of responses, and

• δ, a binary relation from invs × V to resps × V that is
total, in the sense that, for every (a, v) ∈ invs × V,
there is at least one (b, v′) ∈ resps × V such that
((a, v), (b, v′)) ∈ δ.


We sometimes use dot notation, writing
T.V, T.V_0, T.invs, ... for the components of T.
We say that T is deterministic if V_0 is a singleton
set {v_0}, and δ is a mapping, that is, for every
(a, v) ∈ invs × V, there is exactly one (b, v′) ∈ resps × V
such that ((a, v), (b, v′)) ∈ δ.

We allow nondeterminism in our definition of a se- 
quential type in order to make our notion of “service” 
as general as possible. In particular, the problem of 
k-set-consensus can be specified using a nondetermin- 
istic sequential type. 

Example. Read/write sequential type: Here, V is a
set of “values”, V_0 = {v_0}, where v_0 is a distinguished
element of V, invs = {read} ∪ {write(v) : v ∈ V},
resps = V ∪ {ack}, and δ = {((read, v), (v, v)) : v ∈
V} ∪ {((write(v), v′), (ack, v)) : v, v′ ∈ V}.

Example. Binary consensus sequential type: Here,
V = {{0}, {1}, ∅}, V_0 = {∅}, invs = {init(v) :
v ∈ {0,1}}, resps = {decide(v) : v ∈ {0,1}},
and δ = {((init(v), ∅), (decide(v), {v})) : v ∈ {0,1}} ∪
{((init(v), {v′}), (decide(v′), {v′})) : v, v′ ∈ {0,1}}.

Example. k-consensus sequential type: Now V is
the set of subsets of {0,1,...,k} having at most k elements,
V_0 = {∅}, invs = {init(v) : v ∈ {0,1,...,k}},
resps = {decide(v) : v ∈ {0,1,...,k}}, and δ =
{((init(v), W), (decide(v′), W ∪ {v})) : |W| < k, v′ ∈
W ∪ {v}} ∪ {((init(v), W), (decide(v′), W)) : |W| =
k, v′ ∈ W}.

Thus, the first k values are remembered, and every 
operation returns one of these values. 
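The three example types can be sketched in code as follows. This is only an illustration under assumed encodings that are not part of the paper: invocations are tuples such as ("read",), ("write", v) or ("init", v), states are Python values or frozensets, and each transition relation δ is written as a function returning the set of allowed (response, new value) pairs.

```python
EMPTY = frozenset()  # stands for the empty set used as the initial value

def read_write_delta(inv, val):
    """Read/write type: allowed (response, new_value) pairs."""
    if inv == ("read",):
        return {(val, val)}
    _, v = inv  # inv is ("write", v)
    return {("ack", v)}

def binary_consensus_delta(inv, val):
    """Binary consensus type: the first init value is remembered."""
    _, v = inv  # inv is ("init", v)
    if val == EMPTY:
        return {(("decide", v), frozenset({v}))}
    (w,) = val  # val is a singleton {w}
    return {(("decide", w), val)}

def k_set_consensus_delta(k):
    """k-consensus type: remember the first k values, return any of them."""
    def delta(inv, W):
        _, v = inv
        if len(W) < k:
            new_W = W | {v}
            return {(("decide", u), new_W) for u in new_W}
        return {(("decide", u), W) for u in W}
    return delta

def is_deterministic_at(delta, inv, val):
    """True if delta allows exactly one outcome for (inv, val)."""
    return len(delta(inv, val)) == 1
```

With these encodings the read/write and binary consensus relations allow exactly one outcome per invocation, while the k-consensus relation may allow several, which is where the nondeterminism of the sequential type shows up.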


2.3 Canonical f-resilient atomic objects 


A “canonical f-resilient atomic object” describes 
the allowable concurrent behavior of atomic objects. 
Namely, we define the canonical f-resilient atomic ob- 
ject of type T for endpoint set J and index k, where 


• T is a sequential type,


• J is a finite set of endpoints at which invocations
and responses may occur,


• f ∈ ℕ is the level of resilience, and


• k is a unique index (name) for the service.


The object is described as an I/O automaton, in Figure 1.

The parameter J allows different objects to be connected
to the same or different sets of processes. A
process at endpoint i ∈ J can issue any invocation
specified by the underlying sequential type and can
(potentially) receive any allowable response. We allow
concurrent (overlapping) operations, at the same
or different endpoints. The object preserves the order
of concurrent invocations at the same endpoint i
by keeping the invocations and responses in internal
FIFO buffers, two per endpoint (one for invocations
from the endpoint, the other for responses to the endpoint).
The object chooses the result of an operation
nondeterministically, from the set of results allowed by
the transition relation T.δ applied to the invocation
and the current value of val. The object can exhibit
nondeterminism due to nondeterminism of sequential
type T, and due to interleavings of steps for different
invocations.

We model a failure at an endpoint i by an explicit
input action fail_i. We use the task structure of I/O
automata and the basic definition of fair executions to
specify the required resilience: For every process i ∈ J,
we assume the service has two tasks, which we call the
i-perform task and i-output task. The i-perform task
includes the perform_{i,k} action, which carries out operations
invoked at endpoint i. The i-output task includes
all the b_{i,k} actions giving responses at i. In addition,
every i-* task (* is perform or output) contains
a special dummy_*_{i,k} action, which is enabled when either
process i has failed or more than f processes in
J have failed. The dummy_*_{i,k} action is intended to
allow, but not force, the service to stop performing

CanonicalAtomicObject(T, J, f, k),
where T = (V, V_0, invs, resps, δ)

Signature:
    Inputs:
        a_{i,k}, a ∈ invs, i ∈ J, the invocations at endpoint i
        fail_i, i ∈ J
    Outputs:
        b_{i,k}, b ∈ resps, i ∈ J, the responses at endpoint i
    Internals:
        perform_{i,k}, i ∈ J
        dummy_*_{i,k}, * ∈ {perform, output}, i ∈ J

State components:
    val ∈ V, initially an element of V_0
    inv-buffer, a mapping from J to finite sequences of invs,
        initially identically empty
    resp-buffer, a mapping from J to finite sequences of resps,
        initially identically empty
    failed ⊆ J, initially ∅

Transitions:
    Input: a_{i,k}
    Effect:
        add a to end of inv-buffer(i)

    Internal: perform_{i,k}
    Precondition:
        a = head(inv-buffer(i))
        δ((a, val), (b, v))
    Effect:
        remove head of inv-buffer(i)
        val ← v
        add b to end of resp-buffer(i)

    Output: b_{i,k}
    Precondition:
        b = head(resp-buffer(i))
    Effect:
        remove head of resp-buffer(i)

    Input: fail_i
    Effect:
        failed ← failed ∪ {i}

    Internal: dummy_*_{i,k}
    Precondition:
        i ∈ failed ∨ |failed| > f ∨ failed = J
    Effect:
        none

Tasks:
    For every i ∈ J:
        i-perform: {perform_{i,k}, dummy_perform_{i,k}}
        i-output: {b_{i,k} : b ∈ resps} ∪ {dummy_output_{i,k}}


Figure 1: A canonical atomic object.


steps on behalf of process i after i fails or after the
resilience level has been exceeded.

The definition of fairness for I/O automata says
that each task must get infinitely many turns to take
steps. In this context, this implies that, for every
i ∈ J, the object eventually responds to an outstanding
invocation at i, unless either i fails or more than
f processes in J fail. If i does fail or more than f
processes in J fail, the fairness definition allows the
object to perform the dummy_*_{i,k} action every time
the i-* task gets a turn, which permits the object
to avoid responding to i. In particular, if more than
f processes fail, the object may avoid responding to
any process in J, since dummy_output_{i,k} is enabled for
all i ∈ J. Also, if all processes connected to the service
(i.e., all processes in J) fail, the object may avoid
responding to any process.

Thus, the basic fairness definition expresses the idea 
that the object is f-resilient: Once more than f of the 
processes connected to the object fail, the object itself 
may “fail” by becoming silent. However, although the 
object may stop responding, it never violates its safety 
guarantees, that is, it never returns values inconsistent 
with the underlying sequential type specification. 

A canonical atomic object whose sequential type is 
read/write is called a canonical register. In this paper, 
we will consider canonical reliable (wait-free) registers. 
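The behavior of the canonical object in Figure 1 can be approximated by a small simulation. The sketch below is an assumption-laden illustration rather than the automaton itself: the class and method names are invented here, the nondeterministic choice among outcomes allowed by δ is replaced by an arbitrary fixed choice, and scheduling of tasks is left to the caller.

```python
from collections import deque

def register_delta(inv, val):
    """Read/write type encoded as a function to (response, new_value) pairs."""
    if inv == ("read",):
        return {(val, val)}
    return {("ack", inv[1])}  # inv is ("write", v)

class CanonicalAtomicObject:
    def __init__(self, delta, initial, endpoints, f):
        self.delta = delta
        self.val = initial
        self.J = set(endpoints)
        self.f = f
        self.inv_buffer = {i: deque() for i in self.J}
        self.resp_buffer = {i: deque() for i in self.J}
        self.failed = set()

    def invoke(self, i, a):
        """Input a_{i,k}: append the invocation to endpoint i's buffer."""
        self.inv_buffer[i].append(a)

    def fail(self, i):
        """Input fail_i."""
        self.failed.add(i)

    def dummy_enabled(self, i):
        """Precondition of dummy_*_{i,k} from Figure 1."""
        return (i in self.failed or len(self.failed) > self.f
                or self.failed == self.J)

    def perform(self, i):
        """Internal perform_{i,k}; returns False if nothing is pending."""
        if not self.inv_buffer[i]:
            return False
        a = self.inv_buffer[i].popleft()
        # Arbitrary but fixed choice among the outcomes allowed by delta.
        resp, self.val = min(self.delta(a, self.val), key=repr)
        self.resp_buffer[i].append(resp)
        return True

    def output(self, i):
        """Output b_{i,k}; returns None if no response is pending."""
        if not self.resp_buffer[i]:
            return None
        return self.resp_buffer[i].popleft()
```

A scheduler that models fairness would give each i-perform and i-output task repeated turns and, whenever dummy_enabled(i) holds, would be free to do nothing for that task.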


2.4 f-resilient atomic objects 


An I/O automaton A is an f-resilient atomic object
of type T for endpoint set J and index k, provided that
it implements the canonical f-resilient atomic object
S of type T for J and k, in the following sense:


1. A and S have the same input actions (including 
fail actions) and the same output actions. 


2. Any trace of A is also a trace of S. (This implies 
that A guarantees atomicity.) 


3. Any fair trace of A is also a fair trace of S. (This 
says that A is f-resilient.) 


We say that A is wait-free (or, reliable), if it is (|J| − 1)-resilient.
This is equivalent to saying that (a) A is |J|-resilient,
or (b) A is f-resilient for some f > |J| − 1,
or (c) A is f-resilient for every f > |J| − 1.
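One way to see why these formulations coincide is to compare the precondition of the dummy actions in Figure 1 for different resilience levels. The brute-force check below is a sketch with hypothetical helper names; it enumerates every failure set over a small endpoint set and compares when the dummy action is enabled.

```python
from itertools import combinations

def dummy_enabled(i, failed, J, f):
    """Precondition of dummy_*_{i,k}: i failed, more than f failed, or all failed."""
    return i in failed or len(failed) > f or failed == J

def same_dummy_behaviour(J, f1, f2):
    """True if resilience levels f1 and f2 enable the dummy action identically."""
    members = sorted(J)
    subsets = [set(c) for r in range(len(members) + 1)
               for c in combinations(members, r)]
    return all(dummy_enabled(i, F, J, f1) == dummy_enabled(i, F, J, f2)
               for F in subsets for i in J)
```

For J = {1, 2, 3}, levels 2, 3 and 10 all give the same enabling condition, whereas levels 1 and 2 differ (for example when exactly two processes have failed).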


3 System Model with Atomic Objects 


Our system model consists of a collection of process 
automata, reliable registers, and fault-prone atomic 
objects (which we sometimes refer to as services). For 
this section, we fix I, K, and R, finite (disjoint) index 
sets for processes, services, and registers, respectively, 
and T, a sequential type, representing the problem the
system is intended to solve. A distributed system for 
I, K, R, and T is the composition of the following I/O 
automata (see [18, chapter 8]): 


1. Processes P_i, i ∈ I,


2. Services (atomic objects) S_k, k ∈ K. We let T_k
denote the sequential type, and J_k ⊆ I the set of
endpoints, of service S_k. We assume k itself is the
index.


3. Registers S_r, r ∈ R. We let V_r denote the value
set and v_{0,r} the initial value for register S_r. We
assume r is the index.


Processes interact only via services and registers.
Process P_i can invoke an operation on service S_k provided
that i ∈ J_k. Process P_i can also invoke a read
or write operation on register S_r provided that i ∈ J_r.
Services and registers do not communicate directly
with one another, but may interact indirectly via processes.
In the remainder of this section, we describe
the components in more detail and define terminology
needed for the results and proofs.


3.1 Processes 


We assume that process P_i, i ∈ I, has the following
inputs and outputs:


• Inputs a_i, a ∈ T.invs, and outputs b_i, b ∈ T.resps.
These represent P_i's interactions with the external
world.


• For every service S_k such that i ∈ J_k, outputs a_{i,k},
a ∈ T_k.invs, and inputs b_{i,k}, b ∈ T_k.resps.


• For every register S_r, outputs a_{i,r}, where a is a read
or write invocation of S_r, and inputs b_{i,r}, where b is
a response of S_r.


• Input fail_i.


P_i may issue several invocations, on the same or
different services or registers, without waiting for responses
to previous invocations. The external world
at P_i may also issue several invocations to P_i without
waiting for responses. As a technicality, we assume
that when P_i performs a decide(v)_i output action, it
records the decision value v in a special state component.

We assume that P_i has only a single task, which
therefore consists of all the locally-controlled actions of
P_i. We assume that in every state, some action in that
single task is enabled. We assume that the fail_i input
action affects P_i in such a way that, from that point
onward, no output actions are enabled. However,
other locally-controlled actions may be enabled; in
fact, by the restriction just above, some such action
must be enabled. This action might be a “dummy”
action, as in the canonical resilient atomic objects defined
in Section 2.3.


3.2 Services and registers


We assume that service S_k is the canonical f-resilient
atomic object of type T_k for J_k and k. Likewise,
we assume that register S_r is the canonical wait-free
atomic read/write object with value set V_r and
initial value v_{0,r}, for J_r and r.


3.3 The complete system


The complete system C is constructed by composing
the P_i, S_k, and S_r automata and then hiding all the
actions used to communicate among them.

For any action a of C, we define the participants of 
action a to be the set of automata with a in their sig- 
nature. Note that no two distinct registers or services 
participate in the same action a, and similarly no two 
distinct processes participate in the same action. Fur- 
thermore, for any action a, the number of participants 
is at most 2. Thus, if an action a has two participants, 
they must be a process and either a service or register. 

As we defined earlier, each process P_i has a single
task, consisting of all the locally controlled actions
of P_i. Each service or register S_c, c ∈ K ∪ R, has
two tasks for each i ∈ J_c: i-perform, consisting of
{perform_{i,c}, dummy_perform_{i,c}}, and i-output, consisting
of {b_{i,c} : b ∈ T_c.resps} ∪ {dummy_output_{i,c}}.
These tasks define a partition of the set of all actions
in the system, except for the inputs of the process automata
that are not outputs of any other automata,
namely, the invocations by the external world and the
fail_i actions. The I/O automata fairness assumptions
imply that each of these tasks gets infinitely many turns
to execute.

We say that a task e is applicable to a finite execution
α iff some action of e is enabled in the last state
of α.


3.4 The consensus problem 


The “traditional” specification of f-resilient binary
consensus is given in terms of a set {P_i, i ∈ I} of
processes, each of which starts with some value v_i
in {0,1}. Processes are subject to stopping failures,
which prevent them from producing any further output.²
As a result of engaging in a consensus algorithm,
each nonfaulty process eventually “decides” on a value
from {0,1}. The behavior of processes is required to
satisfy the following conditions (see, e.g., [18, chapter 6]):


Agreement No two processes decide on different values.


Validity Any value decided on is the initial value of
some process.


Termination In every fair execution in which at
most f processes fail, all nonfaulty processes
eventually decide.


² Stopping failures are usually defined as disabling the process
from executing at all. However, the two definitions are
equivalent with respect to overall system behavior.


In this paper, we specify the consensus problem differently:
We say that a distributed system S solves
f-resilient consensus for I if and only if S is an f-resilient
atomic object of type consensus (Section 2.2)
for endpoint set I. We argue that any system that
satisfies our definition satisfies a slight variant of the 
traditional one. In this variant, inputs arrive explicitly 
via init() actions, not all nonfaulty processes need re- 
ceive inputs, and only nonfaulty processes that do re- 
ceive inputs are guaranteed to eventually decide. Our 
agreement and validity conditions are the same as be- 
fore; our new termination condition is: 


Termination In every fair execution in which at 
most f processes fail, any nonfaulty process that 
receives an input eventually decides. 
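The two safety conditions can be checked mechanically on a finite trace, while termination, being a liveness condition, can only be approximated by listing which obligations are still open. The sketch below assumes a hypothetical trace encoding that is not taken from the paper: events are tuples ("init", i, v), ("decide", i, v) and ("fail", i).

```python
def check_safety(trace):
    """Return True if the trace satisfies agreement and validity."""
    seen_inputs, decided = set(), set()
    for kind, i, *rest in trace:
        if kind == "init":
            seen_inputs.add(rest[0])
        elif kind == "decide":
            if rest[0] not in seen_inputs:
                return False  # validity violated: value was never an input
            decided.add(rest[0])
            if len(decided) > 1:
                return False  # agreement violated: two different decisions
    return True

def pending_obligations(trace):
    """Nonfaulty processes that received an input but have not decided yet."""
    failed = {event[1] for event in trace if event[0] == "fail"}
    got_input = {event[1] for event in trace if event[0] == "init"}
    decided = {event[1] for event in trace if event[0] == "decide"}
    return got_input - failed - decided
```

Under the variant termination condition, a fair execution with at most f failures must eventually empty the set returned by pending_obligations; a finite prefix with a nonempty set is not by itself a violation.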


4 Impossibility of Boosting for 
Atomic Objects 


Our first main theorem is: 


Theorem 1 Let n = |I| be the number of processes,
and let f be an integer such that 0 ≤ f < n − 1. There
does not exist an (f + 1)-resilient n-process implementation
of consensus from canonical f-resilient atomic
objects and canonical reliable registers.


To prove Theorem 1, we assume that such an implementation
exists and derive a contradiction. Let
C denote the complete system, that is, the composition
of the processes P_i, i ∈ I, services S_k, k ∈ K,
and registers S_r, r ∈ R. By assumption, C satisfies
the agreement, validity and termination properties of
consensus.

For each component c ∈ K ∪ R and i ∈ J_c (recall
that J_c denotes the endpoints of c), let inv-buffer(i)_c
denote the invocation buffer of c, which stores invocations
from P_i, and let resp-buffer(i)_c denote the response
buffer of c, which stores responses to P_i. Also
let buffer(i)_c = (inv-buffer(i)_c, resp-buffer(i)_c).


4.1 Assumption 


To prove Theorem 1, we make the following as- 
sumption: 


(i) We assume that the processes P_i, i ∈ I, are deterministic
automata, as defined in Section 2.1. For
services, we assume a slightly weaker condition:
that the sequential type is deterministic, i.e., the
sequential type has a unique initial value and the
transition relation δ is a mapping. Note that the
sequential type for registers is also deterministic,
by definition.


Assumption (i) implies that, after a finite failure-free
execution α, an applicable task e determines a
unique transition, arising from running task e from
the final state s of α. We denote this transition as
transition(e, s) (since it is uniquely defined by the final
state s). If transition(e, s) = (s, a, s′), then we
write first(e, s), action(e, s), and last(e, s) to denote
s, a, and s′, respectively. We sometimes abbreviate
last(e, s) as e(s). Note that, if s is the final state
of α, then transition(e, s), first(e, s), action(e, s), and
last(e, s) are defined iff e is applicable to α.

Assumption (i) implies that any failure-free execu- 
tion can be defined by applying a sequence of tasks, 
one after the other, to the initial state of C. Assump- 
tion (i) does not reduce the generality of our impos- 
sibility result, because any candidate system could be 
restricted to satisfy (i); if the impossibility result holds 
for the restricted automaton, then it also holds for the 
original one. 


Lemma 2 Let α be any finite failure-free execution
of C, e be any task of C applicable to α, and α · β be
any failure-free extension of α such that β includes no
actions of e. Then e is applicable to α · β.


Proof: Task e is either a process task, service task,
or register task. If e is a process task, then e is
applicable to any finite execution, by our assumption
that each process always has some enabled locally
controlled action. If e is a service task, say of service
S_k, then applicability of e to α means that service
S_k has either a pending invocation in an inv-buffer
or a pending response in a resp-buffer, after α.
Since β does not include any actions of e, and the
invocation or response remains pending as long as e
is not scheduled, e is also applicable after α · β. If e is
a register task, the argument is similar.


Let s be any state of C arising after a finite failure-free
execution α of C, and let e be a task that is applicable
to α (equivalently, enabled in s). Then we write
participants(e, s) for the set of participants of action
action(e, s). Note that, for any task e and any state s,
|participants(e, s)| ≤ 2. Also, if |participants(e, s)| =
2, then participants(e, s) is of the form {P_i, S_c}, for
some i ∈ I and c ∈ K ∪ R.


4.2 Initializations and valence 


In our proof, we consider executions in which consensus
inputs arrive from the external world at the
beginning of the execution. Thus, we define an initialization
of C to be a finite execution of C containing
exactly one init()_i action for each i ∈ I, and no other
actions. An execution α of C is input-first if it has an
initialization as a prefix, and contains no other init()
actions. A finite failure-free input-first execution α is
defined to be 0-valent if (1) some failure-free extension
of α contains a decide(0)_i action, for some i ∈ I, and
(2) no failure-free extension of α contains a decide(1)_i
action, for any i ∈ I. The definition of a 1-valent execution
is symmetric. A finite failure-free input-first
execution α is univalent if it is either 0-valent or 1-valent.
A finite failure-free input-first execution α is
bivalent if (1) some failure-free extension of α contains
a decide(0)_i action, for some i, and (2) some failure-free
extension of α contains a decide(1)_i action, for
some i. These definitions immediately imply the following
result:


Lemma 3 Every finite failure-free input-first execu- 
tion of C is either bivalent or univalent. 
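The valence definitions can be illustrated on a toy model. The sketch below assumes a finite directed graph standing in for the (in general infinite) set of failure-free extensions; the dictionary encoding and function names are hypothetical. A vertex is classified by the set of decision values reachable from it.

```python
def reachable_decisions(graph, decided, start):
    """Collect decision values occurring at vertices reachable from start.

    graph maps a vertex to {task: successor}; decided maps a vertex to the
    set of values decided in the step leading to it.
    """
    seen, stack, values = {start}, [start], set()
    while stack:
        v = stack.pop()
        values |= decided.get(v, set())
        for w in graph.get(v, {}).values():
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return values

def valence(graph, decided, start):
    """Classify start as 0-valent, 1-valent or bivalent."""
    values = reachable_decisions(graph, decided, start)
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    if values == {0, 1}:
        return "bivalent"
    return "undetermined"  # no reachable decision; excluded by termination
```

For instance, with graph = {"a": {"e": "a0", "e2": "b"}, "b": {"e": "b1"}} and decided = {"a0": {0}, "b1": {1}}, vertex "a" is bivalent while "b" is 1-valent.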


The following lemma provides the first step of the 
impossibility proof: 


Lemma 4 C has a bivalent initialization. 


Proof: Write I = {1,...,n}. For each i ∈
{0,...,n}, let α^i be an initialization of C in which processes
P_1,...,P_i receive initial value 1 and processes
P_{i+1},...,P_n receive 0. By the validity property of C
and Lemma 3, α^0 is 0-valent, α^n is 1-valent, and every
α^j (j ∈ {0,...,n}) is either univalent or bivalent.

Then there must be some index i ∈ {0,...,n − 1}
such that α^i is 0-valent and α^{i+1} is either 1-valent
or bivalent. The only difference between the initializations
α^i and α^{i+1} is the initial value of P_{i+1}. So
consider a failure-free extension of α^i that is fair, except
that P_{i+1} takes no steps. Since this execution looks
to the rest of the system like an execution in which
P_{i+1} has failed, the termination condition requires that
the other processes must eventually decide. Since the
execution is in fact failure-free and α^i is 0-valent, the
decision must be 0.

Now, an analogous failure-free extension may be
constructed for α^{i+1}, also leading to a decision of
0. Since, by assumption, α^{i+1} is either 1-valent or
bivalent, it must be bivalent.
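The chain of initializations used in this proof is easy to write down concretely. The sketch below uses hypothetical helper names and 0-based list positions, so position p holds the input of process P_{p+1}; valences are encoded as the strings "0", "1" and "bivalent".

```python
def initialization_inputs(n, i):
    """Inputs of the i-th initialization: the first i processes get 1, the rest 0."""
    return [1] * i + [0] * (n - i)

def first_flip(valences):
    """Index i such that initialization i is 0-valent and initialization i+1 is not.

    valences[i] is the valence of the i-th initialization; the chain is
    assumed to start 0-valent and end 1-valent, as validity guarantees.
    """
    assert valences[0] == "0" and valences[-1] == "1"
    for i in range(len(valences) - 1):
        if valences[i] == "0" and valences[i + 1] != "0":
            return i
```

Two consecutive initializations differ in exactly one position, which is what lets the proof silence that single process and carry the decision 0 over to the next initialization.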


For the rest of this section, fix α_b to be any particular
bivalent initialization of C.


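[Figure 2: A hook starting in α. The diagram shows task e leading from α to α_0 (0-valent), and task e′ leading from α to α′, from which task e leads to α_1 (1-valent).]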


4.3 The graph G(C) 


Now define an edge-labeled directed graph G(C) as 
follows: 


(1) The vertices of G(C) are the finite failure-free
input-first extensions of the bivalent initialization
α_b.


(2) G(C) contains an edge labeled with task e from α
to α′ provided that α′ = e(α), where e(α) denotes the
extension of α by the transition that task e determines
from the final state of α.


By assumption (i) of Section 4.1, any task triggers at
most one transition after a failure-free execution α.
Therefore, for any vertex α of G(C) and any task e,
there is at most one edge labeled with e outgoing from
α.


4.4 The existence of a hook 


We show that decisions in C can be made in a par- 
ticular way, described by a hook pattern of executions. 
Similarly to [4], we define a hook to be a subgraph of 
G(C) of the form depicted in Figure 2. 


Lemma 5 G(C) contains a hook. 


Proof: Starting from the bivalent vertex α_b of G(C),
we generate a path π in G(C) that passes through bivalent
vertices only, as follows. We consider all tasks
in a round-robin fashion. Suppose we have reached a
bivalent execution α so far, and task e is the next task
in the round-robin list that is applicable to α. (We
know such a task exists because the process tasks are
always applicable.)

Lemma 2 implies that, for any finite failure-free extension
α′ of α (such that e is not executed along the
suffix of α′ starting in the last state of α), e is applicable
to α′, and hence e(α′) is defined. We look for a vertex
α′ of G(C), reachable from α in G(C) without following
any edge labeled with e, such that e(α′) is bivalent. If
no such vertex α′ exists, the path construction terminates.
Otherwise, we proceed to e(α′) and continue
by processing the next task in the round-robin order.
This construction is presented in Figure 3. Each completed
iteration of the loop extends the path by at
least one edge. Let π be the path generated by this
construction.

First suppose that π is infinite. Then π corresponds
to a fair failure-free input-first execution α of C. Moreover,
every finite input-first prefix of α is bivalent.
Thus, no process can decide in α (for otherwise, the
agreement property of C would be violated). This
contradicts the termination property of C, so π must be finite.


α ← α_b
while true do
    Let e be the next task (in round-robin order)
        applicable to α
    if α has a descendant α′ in G(C) such that
            the path from α to α′ includes
            no e labels and e(α′) is bivalent then
        choose some such α′
        α ← e(α′)
    else
        exit


Figure 3: Hook location in G(C).


Let α be the last vertex of π. By construction, α is bivalent. Upon
termination of the above path construction in vertex α, let e be the
next task in round-robin order that is applicable to α. Such an e
always exists since nonfaulty processes can always take a step, by
assumption. Since the path construction terminated in α, we conclude
that e satisfies the following condition: for any descendant α' of α
such that the path from α to α' includes no e labels, e(α') is
univalent.

Without loss of generality, assume that e(α) is 0-valent. Since α is
bivalent, there is a descendant α' of α such that e(α') is 1-valent.
Let σ_0, ..., σ_m be the sequence of vertices of G(C) on the path
from α to α', and for each j, 0 ≤ j ≤ m − 1, let e_j be the label of
the edge on this path from σ_j to σ_{j+1}. Thus, σ_{j+1} = e_j(σ_j).
By construction, e(σ_0) is 0-valent, e(σ_m) is 1-valent, and every
e(σ_j), j ∈ {1, ..., m − 1}, is univalent. Thus, there exists an
index j ∈ {0, ..., m − 1} such that e(σ_j) is 0-valent and
e(σ_{j+1}) is 1-valent. As a result, we obtain a hook (Figure 2)
with e in the hook equal to e in this proof, α = σ_j, α' = σ_{j+1},
α_0 = e(σ_j), α_1 = e(σ_{j+1}), and e' = e_j.
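
The last step of this proof is a discrete intermediate-value argument: a sequence of univalent values that starts at 0 and ends at 1 must flip between two adjacent positions. The following is a minimal sketch of that step only, assuming the valences of e(σ_0), ..., e(σ_m) are already known and given as a list; the function name `find_hook_index` is hypothetical and does not come from the paper.

```python
def find_hook_index(valences):
    """Return j such that valences[j] == 0 and valences[j + 1] == 1.

    valences[j] stands for the valence of e(sigma_j). Every entry is
    assumed to be 0 or 1 (univalent), with valences[0] == 0 and
    valences[-1] == 1, as in the proof of Lemma 5.
    """
    if valences[0] != 0 or valences[-1] != 1:
        raise ValueError("expected a 0-valent start and a 1-valent end")
    for j in range(len(valences) - 1):
        if valences[j] == 0 and valences[j + 1] == 1:
            return j
    # Cannot happen for a 0/1 sequence that starts at 0 and ends at 1.
    raise AssertionError("no flip found")


# The first 0 -> 1 flip identifies the two adjacent vertices of the hook.
j = find_hook_index([0, 0, 1, 0, 1])
print("hook between sigma_%d and sigma_%d" % (j, j + 1))
```

Any flip would do for the proof; the sketch simply returns the first one.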


4.5 Similarity 


In this section, we introduce notions of similarity 
between system states. These will be used in showing 
non-existence of a hook, which will yield the contra- 
diction needed for the impossibility proof. First, we 
define j-similar system states. 

Let j ∈ I and let s_0 and s_1 be states of C. Then s_0 and s_1 are
j-similar if:


(1) For every i ∈ I − {j}, the state of P_i is the same in s_0 and
    s_1.


(2) For every c ∈ K ∪ R:

    1. The value of val_c is the same in s_0 and s_1.

    2. For every i ∈ J_c − {j}, the value of buffer(i)_c is the
       same in s_0 and s_1.
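
As an illustration of the definition, j-similarity can be written as a predicate over two state snapshots. This is only a sketch under an assumed state layout (dictionaries keyed by process and component index); the layout and the name `j_similar` are hypothetical and are not part of the paper's model.

```python
def j_similar(s0, s1, j):
    """Check conditions (1) and (2) of j-similarity.

    Assumed (hypothetical) layout of a state:
      state["procs"][i]       -> state of process P_i
      state["val"][c]         -> val_c of service or register c
      state["buffers"][c][i]  -> buffer(i)_c for each endpoint i of c
    Both states are assumed to have the same keys.
    """
    # (1) All processes other than j have the same state.
    procs_ok = all(s0["procs"][i] == s1["procs"][i]
                   for i in s0["procs"] if i != j)
    # (2.1) Every component has the same value.
    vals_ok = all(s0["val"][c] == s1["val"][c] for c in s0["val"])
    # (2.2) Every buffer not belonging to j is the same.
    bufs_ok = all(s0["buffers"][c][i] == s1["buffers"][c][i]
                  for c in s0["buffers"]
                  for i in s0["buffers"][c] if i != j)
    return procs_ok and vals_ok and bufs_ok


s0 = {"procs": {0: "p0", 1: "p1"},
      "val": {"r": 5},
      "buffers": {"r": {0: [], 1: ["read"]}}}
s1 = {"procs": {0: "p0", 1: "p1-after-step"},
      "val": {"r": 5},
      "buffers": {"r": {0: [], 1: []}}}

print(j_similar(s0, s1, 1))  # the states differ only in what belongs to 1
print(j_similar(s0, s1, 0))
```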


Lemma 6 Let j ∈ I. Let α_0 and α_1 be finite failure-free
input-first executions, s_0 and s_1 the respective final states of
α_0 and α_1. Suppose that s_0 and s_1 are j-similar. If α_0 and α_1
are univalent, then they have the same valence.


Proof: We proceed by contradiction. Fix j, α_0, α_1, s_0, and s_1
as in the hypotheses of the lemma, and suppose (without loss of
generality) that α_0 is 0-valent and α_1 is 1-valent. Let J ⊆ I be
any set of indices such that j ∈ J and |J| = f + 1. Since
f < n − 1 by assumption, we have |J| < n, and so I − J is nonempty.

Consider a fair extension of α_0, α_0 · β, in which the first f + 1
actions of β are fail_i, i ∈ J, and no other fail actions occur in
β. Note that, for all i ∈ J, β contains no output actions of P_i.
Assume that in β, no perform_{i,c} or b_{i,c} (i.e., a response)
action of any i-* task, i ∈ J, occurs at any component c ∈ K ∪ R;
we may assume this because, for each i ∈ J, action fail_i enables a
dummy action in every i-* task of every service and register (* is
perform or output).

Since α_0 is a failure-free input-first execution, the resulting
extension α_0 · β is a fair input-first execution containing f + 1
failures. Therefore, the termination property for (f + 1)-resilient
consensus implies that there is a finite prefix of α_0 · β, which we
denote by α_0 · γ, that includes decide(v)_l for some l ∉ J and
v ∈ {0, 1}. Construct α_0 · γ', where γ' is obtained from γ by
removing the fail_i actions, all dummy actions, and any remaining
internal actions of P_i, i ∈ J. Thus, α_0 · γ' is a failure-free
extension of α_0 that includes decide(v)_l. Since α_0 is 0-valent,
v must be equal to 0.

We claim that decide(0)_l occurs in the suffix γ', rather than in
the prefix α_0. Suppose for contradiction that the decide(0)_l
action occurs in the prefix α_0. Then by our technical assumption
about processes, the decision value 0 is recorded in the state of
P_l. Since s_0 and s_1 are j-similar and l ≠ j, the same decision
value 0 appears in the state s_1. But this contradicts the
assumption that α_1, which ends in s_1, is 1-valent. So, it must be
that decide(0)_l occurs in the suffix γ'.

Now we show how to append essentially the same γ' after α_1. We
know that, for every i ∈ J, γ' contains no locally controlled action
of P_i, and contains no perform_{i,c} or b_{i,c} action (b ∈ resps),
for any c ∈ K ∪ R. By definition of j-similarity, we have:


(a) For every i ∉ J, the state of P_i is the same in s_0 and s_1.


(b) For every c ∈ K ∪ R:

    1. The value of val_c is the same in s_0 and s_1 (that is, in
       the final states of α_0 and α_1).

    2. For every i ∈ J_c − J, the value of buffer(i)_c is the same
       in s_0 and s_1.


Thus:


(c) If γ' contains any locally controlled actions of a process i,
    then the state of P_i is the same in s_0 and s_1.


(d) For every c ∈ K ∪ R:

    1. The value of val_c is the same in s_0 and s_1.

    2. For every i ∈ J_c, if γ' contains any perform_{i,c} or
       b_{i,c} (b ∈ resps) actions of c, then the value of
       buffer(i)_c is the same in s_0 and s_1.


It follows that it is possible to append “essentially” the same γ'
after α_1, resulting in a failure-free extension of α_1 that
includes decide(0)_l. (Strictly speaking, we are appending another
execution fragment after α_1, one that looks the same to all the
processes and service tasks that take steps in γ'.) But α_1 is
1-valent, a contradiction.


Similarly, we define the notion of k-similar states: Let k ∈ K, and
let s_0 and s_1 be states of C. Then s_0 and s_1 are k-similar if
the following conditions hold:


(1) For every i ∈ I, the state of P_i is the same in s_0 and s_1.


(2) For every c ∈ (K − {k}) ∪ R, the state of S_c is the same in
    s_0 and s_1.


Lemma 7 Let k ∈ K. Let α_0 and α_1 be finite failure-free
input-first executions, s_0 and s_1 the respective final states of
α_0 and α_1. Suppose that s_0 and s_1 are k-similar. If α_0 and α_1
are univalent, then they have the same valence.


Proof: Fix k, α_0, α_1, s_0, and s_1 as in the hypotheses of the
lemma. By contradiction, suppose (without loss of generality) that
α_0 is 0-valent and α_1 is 1-valent. Let J ⊆ I be any set of indices
such that |J| = f + 1 and, if |J_k| ≤ f + 1, then J_k ⊆ J, whereas
if |J_k| > f + 1, then J ⊆ J_k.

Consider a fair extension of α_0, α_0 · β, in which the first f + 1
actions of β are fail_i, i ∈ J, and no other fail actions occur in
β. Note that, for all i ∈ J, β contains no output actions of P_i.
Assume that in β, no perform_{i,k} or b_{i,k} action (b ∈ resps) of
S_k occurs; we may assume this because the f + 1 fail actions enable
dummy actions in all tasks of S_k.

Since α_0 is a failure-free input-first execution, the resulting
extension α_0 · β is a fair input-first execution containing f + 1
fail actions. Therefore, the termination property for
(f + 1)-resilient consensus implies that there is a finite prefix of
α_0 · β, which we denote by α_0 · γ, that includes decide(v)_l for
some l ∈ I − J and v ∈ {0, 1}. We know that decide(v)_l occurs in
the suffix γ, rather than in the prefix α_0, by an argument similar
to that in the proof of Lemma 6.

Now construct α_0 · γ', where γ' is obtained from γ by removing all
the fail_i actions, i ∈ J, and all dummy actions. Thus, α_0 · γ' is
a failure-free extension of α_0 that includes decide(v)_l. Since α_0
is 0-valent, v must be equal to 0.

Now we show how to append essentially the same γ' after α_1. By
definition of k-similarity, we have:


(a) For every i ∈ I, the state of P_i is the same in s_0 and s_1.


(b) For every c ∈ (K − {k}) ∪ R, the state of S_c is the same in
    s_0 and s_1.


Thus:


(c) For every c ∈ K ∪ R, if γ' contains any perform_{i,c} or
    b_{i,c} actions of S_c, then the state of S_c is the same in
    s_0 and s_1, since c ≠ k in this case.


By properties (a) and (c), it follows that it is possible to append
“essentially” the same γ' after α_1 (differing only in the state of
S_k), resulting in a failure-free extension of α_1 that includes
decide(0)_l. But α_1 is 1-valent, a contradiction.


4.6 The non-existence of a hook 


Now we are ready to prove the absence of hooks.

Lemma 8 G(C) contains no hooks.


Proof: By contradiction. Assume that a hook exists, as depicted in
Figure 2. Let s, s', s_0, and s_1 be the respective final states of
α, α', α_0, and α_1, and let e and e' be the two tasks involved in
the hook, as shown. Since α_0 and α_1 are 0-valent and 1-valent,
respectively, by Lemmas 6 and 7, s_0 and s_1 cannot be j-similar for
any j ∈ I, or k-similar for any k ∈ K. In particular, we cannot have
s_0 = s_1. Also, note that e'(α_0) is 0-valent, since it is an
extension of a 0-valent execution. Therefore, again by Lemmas 6 and
7, e'(s_0) and s_1 cannot be j-similar for any j ∈ I, or k-similar
for any k ∈ K. In particular, we cannot have e'(s_0) = s_1. We
establish the contradiction using a series of claims:


Claim 1: e ≠ e'.
Suppose for contradiction that e = e'. Then by determinism
(Assumption (i) in Section 4.1), we have α_0 = α'. However, α_0 is
0-valent, whereas α' has a 1-valent failure-free extension α_1, a
contradiction.

Claim 1 and Lemma 2 imply that e' is enabled from e(s).


Claim 2: participants(e, s) ∩ participants(e', s) ≠ ∅.
Suppose for contradiction that participants(e, s) ∩
participants(e', s) = ∅. Then the two tasks commute, that is,
e'(e(s)) = e(e'(s)). In other words, e'(s_0) = s_1, a contradiction.


Since participants(e, s) ∩ participants(e', s) ≠ ∅, either a
process, service, or register must be in the intersection. We prove
three claims showing that none of these possibilities can hold, thus
obtaining the needed contradiction.


Claim 3: There does not exist i ∈ I such that
P_i ∈ participants(e, s) ∩ participants(e', s).
Suppose for contradiction that P_i ∈ participants(e, s) ∩
participants(e', s). Then the two actions action(e, s) and
action(e', s) involve only P_i and the buffers buffer(i)_c,
c ∈ K ∪ R. Furthermore (since the same task e is used), the action
action(e, s') also involves only P_i and the buffers buffer(i)_c,
c ∈ K ∪ R. But then the states s_0 and s_1 can differ only in the
state of P_i and in the values of buffer(i)_c, c ∈ K ∪ R. This
implies that s_0 and s_1 are i-similar, a contradiction.


Claim 4: There does not exist k ∈ K such that
S_k ∈ participants(e, s) ∩ participants(e', s).
Suppose for contradiction that S_k ∈ participants(e, s) ∩
participants(e', s). There are four possibilities:


1. participants(e, s) = participants(e', s) = {S_k}.
   Then e and e' must be perform tasks of S_k, and so involve only
   the state of S_k. But then the states s_0 and s_1 can differ only
   in the state of S_k. So s_0 and s_1 are k-similar, a
   contradiction.


2. For some i ∈ I, participants(e, s) = {S_k, P_i} and
   participants(e', s) = {S_k}.
   Then the two tasks commute, that is, e'(s_0) = s_1, a
   contradiction.


3. For some i ∈ I, participants(e', s) = {S_k, P_i} and
   participants(e, s) = {S_k}.
   Again, the two tasks commute, that is, e'(s_0) = s_1, a
   contradiction.


4. For some i, j ∈ I, participants(e, s) = {S_k, P_i} and
   participants(e', s) = {S_k, P_j}.
   By Claim 3, we know that i ≠ j. Then again, the two tasks
   commute, so e'(s_0) = s_1, a contradiction.


Note that for cases 2 and 3 above (but not case 4), whenever
action(e, s) and action(e', s) access the same buffer, one action
inserts an item and the other removes an item. Hence the actions
commute.


Claim 5: There does not exist r ∈ R such that
S_r ∈ participants(e, s) ∩ participants(e', s).
Suppose for contradiction that S_r ∈ participants(e, s) ∩
participants(e', s). There are four possibilities:


1. participants(e, s) = participants(e', s) = {S_r}.
   Then e and e' must be perform tasks of register S_r. Without loss
   of generality, suppose that action(e, s) is perform_{i,r} and
   action(e', s) is perform_{j,r}. Since e ≠ e', we have i ≠ j. We
   consider subcases based on whether the two operations performed
   are reads or writes:


   (a) action(e, s) and action(e', s) both perform read operations.
       Then the two tasks commute, so e'(s_0) = s_1, a
       contradiction.


   (b) action(e, s) performs a write operation.
       Then states s_0 and s_1 can differ only in the value of
       inv-buffer(j)_r and resp-buffer(j)_r: in s_1, an invocation
       is missing from inv-buffer(j)_r and an extra response appears
       at the end of resp-buffer(j)_r, with respect to
       inv-buffer(j)_r and resp-buffer(j)_r in s_0. So s_0 and s_1
       are j-similar, a contradiction.


   (c) action(e, s) performs a read operation and action(e', s)
       performs write(v).
       Then e'(s_0) and s_1 differ only in the value of
       resp-buffer(i)_r (different read responses may be appended at
       the end). So e'(s_0) and s_1 are i-similar, a contradiction.


2. For some i ∈ I, participants(e, s) = {S_r, P_i} and
   participants(e', s) = {S_r}.
   Then the two tasks commute, so e'(s_0) = s_1, a contradiction.


3. For some i ∈ I, participants(e', s) = {S_r, P_i} and
   participants(e, s) = {S_r}.
   Again, the two tasks commute, so e'(s_0) = s_1, a contradiction.


4. For some i, j ∈ I, participants(e, s) = {S_r, P_i} and
   participants(e', s) = {S_r, P_j}.
   By Claim 3, we know that i ≠ j. Then the two tasks commute, so
   e'(s_0) = s_1, a contradiction.


Now Claims 3, 4, and 5 together imply that participants(e, s) ∩
participants(e', s) = ∅. But this directly contradicts Claim 2.
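
The three register subcases of Claim 5 (case 1) can be replayed on a toy read/write register. The sketch below is an illustration under simplifying assumptions: the state layout, the operation encoding, and the helper names (`make_state`, `perform`, `differ_only_in_buffers_of`) are hypothetical, and a single register with two endpoints stands in for S_r.

```python
import copy


def make_state(val, invs):
    """Register state: current value plus per-process buffers."""
    return {"val": val,
            "inv": {i: list(ops) for i, ops in invs.items()},
            "resp": {i: [] for i in invs}}


def perform(state, i):
    """Apply the invocation at the head of inv-buffer(i); return a new state."""
    s = copy.deepcopy(state)
    op = s["inv"][i].pop(0)
    if op[0] == "read":
        s["resp"][i].append(("value", s["val"]))
    else:  # op == ("write", v)
        s["val"] = op[1]
        s["resp"][i].append(("ack",))
    return s


def differ_only_in_buffers_of(a, b, i):
    """True if a and b agree on val and on all buffers except those of i."""
    others = [p for p in a["inv"] if p != i]
    return (a["val"] == b["val"]
            and all(a["inv"][p] == b["inv"][p] for p in others)
            and all(a["resp"][p] == b["resp"][p] for p in others))


# (a) Two reads commute.
s = make_state(0, {"i": [("read",)], "j": [("read",)]})
reads_commute = perform(perform(s, "i"), "j") == perform(perform(s, "j"), "i")

# (b) e is a write by i: s_0 = e(s) and s_1 = e(e'(s)) differ only in j's buffers.
s = make_state(0, {"i": [("write", 1)], "j": [("write", 2)]})
s0_b, s1_b = perform(s, "i"), perform(perform(s, "j"), "i")

# (c) e is a read by i, e' is a write by j: e'(s_0) and s_1 differ only
# in the response recorded for i.
s = make_state(0, {"i": [("read",)], "j": [("write", 7)]})
e1_s0_c = perform(perform(s, "i"), "j")
s1_c = perform(perform(s, "j"), "i")

print(reads_commute,
      differ_only_in_buffers_of(s0_b, s1_b, "j"),
      differ_only_in_buffers_of(e1_s0_c, s1_c, "i"))
```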


Lemma 5 contradicts Lemma 8. Hence we have derived a contradiction
by assuming the negation of Theorem 1. Hence Theorem 1 is
established.


5 k-Set Consensus 


Our boosting impossibility result concerns consen- 
sus implementations. Interestingly, while it is not pos- 
sible to implement (f + 1)-resilient consensus using 
registers and f-resilient atomic objects, this is not the 
case for the k-set consensus problem [6]. In k-set con- 
sensus, the processes have to agree on at most k differ- 
ent values (k-set consensus reduces to consensus when 
k=1). 

Consider a set of f-resilient k-set consensus ser- 
vices, each one exporting m ports. An algorithm that 
implements f’-resilient k’-set consensus works as fol- 
lows. Take a principal subset of the processes, and 
divide it into s disjoint groups, each one accessing a 
different service. Each principal process participates 
in an execution proposing its input value to its des- 
ignated service. When it gets a decision back, the 
process decides on the value and writes it in a shared 
register. The remaining processes simply wait until 
at least one principal process writes the value. The 
values of k’ and f’ depend on the size of the princi- 
pal set, and on the number s of services we divide it 
into. There is a tradeoff between k’ and f’: if a small 
number of failures f’ is tolerated, then a high degree 
of agreement is achieved, namely a small k’. If more 
failures f’ must be tolerated, then a lower degree of 
agreement is achieved, namely a large k’. 

To achieve correctness, we must ensure first that at least one
principal process receives a decision from its service and
communicates the decision to all, i.e., (1) every f-resilient
service is connected to f + 1 processes, and (2) fewer than
s · (f + 1) principal processes can fail: f' < s · (f + 1). Thus,
there is at least one service S that is not killed, and moreover,
there is at least one correct principal process that receives a
decision value from S and writes the decision in a shared register.
Thus, every correct process eventually decides. The number of
possible different decision values is at most s · k: there are at
most k different values returned per service; more precisely, at
most k values per service being accessed by at least k processes,
and c values for a service that is being accessed by c processes for
c < k. Thus, for a desired overall resilience f', we want the
smallest possible k' and so we find the smallest integer s that
guarantees f' < s · (f + 1). Thus, we have
s = ⌈(f' + 1)/(f + 1)⌉ services, and take the first f' + 1 processes
to be the principal processes (f' + 1 processes using as few
services as possible, each one with f + 1 input ports). It follows
that


Theorem 9 For any 1 ≤ k ≤ m, k ≤ f ≤ m − 1, 1 ≤ f' ≤ n − 1, it is
possible to implement f'-resilient k'-set consensus using read-write
memory and f-resilient k-set consensus services, each one with m
ports, for


    k' ≥ k · ⌊(f' + 1)/(f + 1)⌋ + min(k, (f' + 1) mod (f + 1)).


When each available service is wait-free, that is f = m − 1, this
algorithm reduces to the one of [12], and gives a tight bound. As an
example, assume that we want to implement f'-resilient k'-set
consensus in a system of 2c processes, where f' = 2c − 1, using only
1-resilient consensus services, i.e., f = 1, k = 1. The smallest k'
for which we can do this is k' = c, using s = c services, each
shared by 2 processes (f' + 1 = 2c principal processes).

Note that the algorithm above uses services that are not connected
to all processes. It is known that f-resilient f-set consensus
cannot be solved using only reliable registers [2, 13, 19]. We
conjecture that f-resilient f-set consensus cannot be solved using
only reliable registers and services that are connected to all
processes.
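
The number of services s and the bound on k' from Theorem 9 are simple integer arithmetic. The following sketch evaluates them for the example in the text (f = 1, k = 1, f' = 2c − 1); the function names `services_needed` and `k_prime_bound` are hypothetical.

```python
def services_needed(f_prime, f):
    """Smallest s with f' < s * (f + 1), i.e. ceil((f' + 1) / (f + 1))."""
    return -(-(f_prime + 1) // (f + 1))


def k_prime_bound(k, f, f_prime):
    """Right-hand side of the bound on k' in Theorem 9."""
    full, rest = divmod(f_prime + 1, f + 1)
    # 'full' services have f + 1 principal processes each; a last,
    # partially filled service contributes at most min(k, rest) values.
    return k * full + min(k, rest)


# Example from the text: 2c processes, f' = 2c - 1, 1-resilient consensus.
c = 4
print(services_needed(2 * c - 1, 1), k_prime_bound(1, 1, 2 * c - 1))
```

With c = 4 both values equal c, matching the claim that s = c services yield k' = c.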


6 Failure-Oblivious Services 


A failure-oblivious service is a generalization of an 
atomic object. It allows an invocation to trigger mul- 
tiple processing steps instead of just one perform step. 
These steps can interleave with processing steps trig- 
gered by other invocations, and this makes a failure- 
oblivious service non-atomic, in general. A failure- 
oblivious service also allows an invocation to trigger 
any number of responses, at any endpoints, instead 
of just a single response at the endpoint of the in- 
vocation. The service may also include background 
processing tasks, not related to any specific endpoint. 
The key constraint is that no step may depend on ex- 
plicit knowledge of failure events. In this section, we 
define the class of failure-oblivious services, give ex- 
amples, and describe how Theorem 1 can be extended 
to such services. 


6.1  f-resilient failure-oblivious services 


As for atomic objects, we begin by defining a canonical f-resilient
failure-oblivious service. A canonical f-resilient failure-oblivious
service is parameterized by J, f, and k, which have the same
meanings as for canonical atomic objects. Also, in place of the
sequential type parameter T, the service has a service type
parameter U, which is a tuple (V, V_0, invs, resps, glob, δ_1, δ_2,
δ_3), where V and V_0 are as before, invs and resps are the
respective sets of invocations and responses (which can occur at any
endpoint), glob is a set of global tasks, and δ_1, δ_2, δ_3 are
three transition relations.

Here, δ_1 is a total binary relation from invs × J × V to (the set
of mappings from J to finite sequences of resps) × V. It is used to
map an invocation at the head of a particular inv-buffer, and the
current value for val, to a set of results, each of which consists
of a new value for val and sequences of responses to be added to any
or all of the resp-buffers. δ_2 is a total binary relation from
J × V to (the set of mappings from J to finite sequences of
resps) × V. It is used to map a particular endpoint and value of val
to a set of results, defined as above. Finally, δ_3 is a total
binary relation from V to (the set of mappings from J to finite
sequences of resps) × V. It is used to map a value of val to a set
of results. The code for a canonical failure-oblivious automaton,
showing how these parameters are used, appears in Figure 4.

Thus, a canonical f-resilient failure-oblivious service is allowed
to perform rather flexible kinds of processing, both related and
unrelated to individual endpoints, as long as processing decisions
do not depend on knowledge of occurrence of failure events.

An I/O automaton A is an f-resilient failure-oblivious service of
type U, endpoint set J, and index k, provided that it implements the
canonical f-resilient failure-oblivious service S of type U for J
and k, in the same sense as for atomic objects.


6.2 Example: Totally Ordered Broadcast 


We describe an f-resilient totally ordered broadcast service for a
particular message alphabet M, endpoint set J and index k, as a
special case of an f-resilient failure-oblivious service for J and
k. To do this, we need only specify the failure-oblivious service
type U = (V, V_0, invs, resps, glob, δ_1, δ_2, δ_3). Here, V
consists of a single msgs queue, containing messages that have been
totally ordered, together with their sources (Figure 5). V_0
indicates that this queue is initially empty.

The invocation set invs is {bcast(m) : m ∈ M}. The response set
resps is {rcv(m, i) : m ∈ M, i ∈ J}. (rcv(m, i) indicates the
receipt of message m from sender i. This receipt can occur at any
endpoint.) glob consists of one task named g, that is, glob = {g}.
δ_1, the relation describing the transitions that process
invocations from inv-buffers, is defined in Figure 6.

This code processes the first element of inv-buffer(i) by adding it
to the end of the sequence stored in msgs. (Formally,
δ_1((a, i, v), (B, v')) holds iff a = bcast(m), v'.msgs is the
result of adding (m, i) to the end of v.msgs, and B(j) is empty for
all j.)

δ_2 is the identity relation, indicating that no other processing is
done on behalf of i. Relation δ_3 is defined in Figure 7.

(Formally, δ_3(v, (B, v')) holds iff either (a) v.msgs is nonempty,
(m, i) = head(v.msgs), v'.msgs = tail(v.msgs), and for every j ∈ J,
B(j) is the sequence consisting of the single element rcv(m, i), or
(b) v.msgs is empty, v' = v, and for every j, B(j) is the empty
sequence.)
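
To make the roles of δ_1 and δ_3 concrete, here is a small executable sketch of the totally ordered broadcast service. It is an illustration, not the canonical automaton: failures, dummy actions, and task fairness are omitted, and the class and method names are hypothetical.

```python
from collections import deque


class TotallyOrderedBroadcast:
    """Toy model of the service type: val consists of the msgs queue."""

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self.msgs = deque()                               # val.msgs
        self.inv = {i: deque() for i in self.endpoints}   # inv-buffer(i)
        self.resp = {i: deque() for i in self.endpoints}  # resp-buffer(i)

    def bcast(self, i, m):
        """Input action: invocation bcast(m) at endpoint i."""
        self.inv[i].append(m)

    def perform(self, i):
        """delta_1: move the head of inv-buffer(i) to the end of msgs.

        In the automaton this step is only enabled when the buffer is
        nonempty; here an empty buffer is simply a no-op.
        """
        if self.inv[i]:
            self.msgs.append((self.inv[i].popleft(), i))

    def compute_g(self):
        """delta_3: deliver the head of msgs to every endpoint."""
        if self.msgs:
            m, i = self.msgs.popleft()
            for j in self.endpoints:
                self.resp[j].append(("rcv", m, i))


svc = TotallyOrderedBroadcast([1, 2, 3])
svc.bcast(1, "a")
svc.bcast(2, "b")
svc.perform(2)      # "b" is ordered first
svc.perform(1)
svc.compute_g()
svc.compute_g()
print(list(svc.resp[3]))
```

Every endpoint receives the same sequence, and the order is fixed by the order of the perform steps, not by the order of the invocations.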


6.3. Impossibility of Boosting 


Let index set K now include the indices of all failure-oblivious
services. Now the notion of k-similarity restricts the states of all
registers and of all atomic and failure-oblivious services except
S_k.

We now argue that Lemmas 2-8 extend to this case.

Lemma 2: We have added the i-compute and g-compute tasks to the
definition of a service (Figure 4). These are defined using total
transition relations δ_2 and δ_3. Since these are total relations,
we see from Figure 4 that these tasks are always enabled. Hence
Lemma 2 still holds.


CanonicalFailureObliviousService(U, J, f, k),
where U = (V, V_0, invs, resps, glob, δ_1, δ_2, δ_3)


Signature:
    Inputs:
        a_{i,k}, a ∈ invs, i ∈ J
        fail_i, i ∈ J
    Outputs:
        b_{i,k}, b ∈ resps, i ∈ J
    Internals:
        perform_{i,k}, i ∈ J
        compute_{i,k}, i ∈ J
        dummy_*_{i,k}, * ∈ {perform, compute, output}, i ∈ J
        compute_{g,k}, g ∈ glob
        dummy_compute_{g,k}, g ∈ glob


State components:
    As for canonical atomic object.


Transitions:
    Input: a_{i,k}
        As for canonical atomic object.

    Internal: perform_{i,k}
        Precondition:
            a = head(inv-buffer(i))
            δ_1((a, i, val), (B, v))
        Effect:
            remove head of inv-buffer(i)
            val ← v
            for j ∈ J do
                add B(j) to end of resp-buffer(j)

    Internal: compute_{i,k}, i ∈ J
        Precondition:
            δ_2((i, val), (B, v))
        Effect:
            val ← v
            for j ∈ J do
                add B(j) to end of resp-buffer(j)

    Internal: compute_{g,k}, g ∈ glob
        Precondition:
            δ_3(val, (B, v))
        Effect:
            val ← v
            for j ∈ J do
                add B(j) to end of resp-buffer(j)

    Output: b_{i,k}
        As for canonical atomic object.

    Input: fail_i
        As for canonical atomic object.

    Internal: dummy_*_{i,k}, i ∈ J
        As for canonical atomic object.

    Internal: dummy_compute_{g,k}, g ∈ glob
        Precondition:
            |failed| > f
        Effect:
            none


Tasks:
    For every i ∈ J:
        i-perform: {perform_{i,k}, dummy_perform_{i,k}}
        i-compute: {compute_{i,k}, dummy_compute_{i,k}}
        i-output: {b_{i,k} : b ∈ resps} ∪ {dummy_output_{i,k}}
    For every g ∈ glob:
        g-compute: {compute_{g,k}, dummy_compute_{g,k}}


Figure 4: A canonical failure-oblivious service.
Components of val:
    msgs, a finite sequence of items in M × J, initially empty


Figure 5: The composition of val in a totally ordered broadcast
service.


Internal: perform_{i,k}
    Precondition:
        bcast(m) = head(inv-buffer(i))
    Effect:
        remove head of inv-buffer(i)
        add (m, i) to end of msgs


Figure 6: Relation δ_1 in a totally ordered broadcast service.


Lemmas 3-5: The proofs of these lemmas do not 
depend on the definition of a service, and so they carry 
over. 

Lemma 6: The proof carries over by replacing every reference to
perform_{i,k} actions with a reference to perform_{i,k},
compute_{i,k}, or compute_{g,k} actions. We provide a complete proof
in Appendix B.

Lemma 7: Since service S_k is “silent” along γ, the change in its
definition does not affect the proof. The other services have the
same behavior along γ and γ', and the original proof of Lemma 7 does
not refer to their detailed definition. Hence this proof carries
over.

Lemma 8: Claims 1, 2, 3, and 5 carry over with no difference in the
proof, since their proofs do not refer to the definition of actions
of services. For Claim 4, the proof of case 1 (participants(e, s) =
participants(e', s) = {S_k}) must be modified by replacing every
reference to i-perform tasks with a reference to i-perform,
i-compute, or g-compute tasks. The proofs of the other cases carry
over. Hence the lemma as a whole carries over. We provide a complete
proof in Appendix B.

Hence the following result:


Theorem 10 Let f and n be integers, 0 ≤ f < n − 1. There does not
exist an (f + 1)-resilient n-process implementation of consensus
from canonical f-resilient atomic services, canonical f-resilient
failure-oblivious services, and canonical reliable registers.


7 General (Failure-Aware) Services 


A general, or failure-aware service is a further gen- 
eralization of a failure-oblivious service. This time, 
the generalization removes the failure-oblivious con- 
straint, allowing the service’s decisions to depend on 
knowledge of failures of processes connected to the ser- 
vice. 


Internal: compute_{g,k}
    Precondition:
        true
    Effect:
        if (m, i) = head(msgs) then
            remove head of msgs
            for each j ∈ J:
                add rcv(m, i) to resp-buffer(j)


Figure 7: Relation δ_3 in a totally ordered broadcast service.


7.1 f-resilient general services 


A canonical f-resilient general service is parameterized by J, f,
and k, which have the same meanings as for canonical
failure-oblivious services, and by a service type parameter U, which
is a tuple (V, V_0, invs, resps, glob, δ_1, δ_2, δ_3), as for
failure-oblivious services. This time, however, the domains of δ_1,
δ_2, and δ_3 are invs × J × V × 2^J, J × V × 2^J, and V × 2^J,
respectively. The final argument, in each case, will be instantiated
in the service code with the current failed set.

The only portions of the code that are different from those for
failure-oblivious services are the three transition definitions that
use δ_1, δ_2, and δ_3 (Figure 8).


Internal: perform_{i,k}
    Precondition:
        a = head(inv-buffer(i))
        δ_1((a, i, val, failed), (B, v))
    Effect:
        remove head of inv-buffer(i)
        val ← v
        for j ∈ J do
            add B(j) to end of resp-buffer(j)


Internal: compute_{i,k}, i ∈ J
    Precondition:
        δ_2((i, val, failed), (B, v))
    Effect:
        val ← v
        for j ∈ J do
            add B(j) to end of resp-buffer(j)


Internal: compute_{g,k}, g ∈ glob
    Precondition:
        δ_3((val, failed), (B, v))
    Effect:
        val ← v
        for j ∈ J do
            add B(j) to end of resp-buffer(j)


Figure 8: Relations δ_1, δ_2 and δ_3 in a general service.


An I/O automaton A is an f-resilient general ser- 
vice of type U, endpoint set J, and index k, provided 
that it implements the canonical f-resilient general 
service S of type U for J and k, in the same sense 
as for atomic and failure-oblivious services. 
7.2 Examples: Failure detectors 


In this section, we describe how a variety of well- 
known failure detectors [4,5] can be modeled as general 
services. Our failure detectors do not provide all the 
functionality of the standard model [4]: because our 
failure detectors are automata, they cannot predict 
future input actions. Thus, our services encompass 
only realistic failure detectors [7]. 

All of our failure detector services have empty invs sets, that is,
their only inputs are fail_i actions.


7.2.1 Perfect Failure Detector P 


First, we define an f-resilient perfect failure detector for J and
k. V contains only one (trivial) state, that is, the service
maintains no internal information other than the failed set.
Responses are of the form suspect(J'), J' ⊆ J. The set glob of
global tasks is empty. Since there are no invocations, δ_1 is
trivial. Since there are no global tasks, δ_3 is empty. All that
remains is to define δ_2, which describes computation on behalf of
each process i: δ_2(i, failed) simply puts a suspect response
containing the current failed set into i's response buffer
(Figure 9).


Internal: compute_{i,k}
    Precondition:
        true
    Effect:
        add suspect(failed) to resp-buffer(i)


Figure 9: Relation δ_2 in P.


7.2.2 Eventually Perfect Failure Detector ◇P


Again, responses are of the form suspect(J'), J' ⊆ J. We model
eventual perfection using a mode variable, which can take on values
perfect or imperfect. Initially, and after each new failure, mode is
set to imperfect. A background task is responsible for eventually
switching mode to perfect. Since failures must eventually stop, the
mode eventually remains perfect. While in perfect mode, the failure
detector suspects exactly the processes that have failed. In
imperfect mode, suspicions are arbitrary. The set of internal state
components in ◇P is presented in Figure 10.


Components of val:
    mode ∈ {perfect, imperfect}, initially imperfect
    oldfailed ⊆ J, initially ∅


Figure 10: The composition of val in ◇P.


The global task set is glob = {g_1, g_2}. Task g_1 is responsible
for setting mode to imperfect while task g_2 sets it to perfect. The
interesting transition definitions are presented in Figure 11.


Internal: compute_{i,k}
    Precondition:
        true
    Effect:
        if mode = perfect then
            add suspect(failed) to resp-buffer(i)
        else
            choose J' where J' ⊆ J
            add suspect(J') to resp-buffer(i)


Internal: compute_{g_1,k}
    Precondition:
        true
    Effect:
        if failed ≠ oldfailed then
            mode := imperfect
            oldfailed := failed


Internal: compute_{g_2,k}
    Precondition:
        true
    Effect:
        if failed = oldfailed then
            mode := perfect


Figure 11: Internal transitions in ◇P.
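
The mode-switching logic of Figure 11 can be exercised with a small state machine. This is a sketch under assumptions: resilience, dummy actions, and response buffers are left out, the arbitrary choice in imperfect mode is modeled with a seeded random generator, and the class name `EventuallyPerfectDetector` is hypothetical.

```python
import random


class EventuallyPerfectDetector:
    """Toy model of the eventually perfect detector's internal transitions."""

    def __init__(self, endpoints, seed=0):
        self.endpoints = set(endpoints)
        self.failed = set()
        self.oldfailed = set()
        self.mode = "imperfect"
        self.rng = random.Random(seed)  # stands in for "choose J'"

    def fail(self, i):
        """Input action fail_i."""
        self.failed.add(i)

    def compute_g1(self):
        """Task g_1: fall back to imperfect mode after a new failure."""
        if self.failed != self.oldfailed:
            self.mode = "imperfect"
            self.oldfailed = set(self.failed)

    def compute_g2(self):
        """Task g_2: switch to perfect mode once failures are recorded."""
        if self.failed == self.oldfailed:
            self.mode = "perfect"

    def compute_i(self, i):
        """Return the set that would be put in a suspect response for i."""
        if self.mode == "perfect":
            return set(self.failed)
        # Imperfect mode: an arbitrary subset of the endpoints.
        return {j for j in self.endpoints if self.rng.random() < 0.5}


d = EventuallyPerfectDetector({0, 1, 2})
d.fail(1)
d.compute_g1()
mode_after_g1 = d.mode
d.compute_g2()
print(mode_after_g1, d.mode, d.compute_i(0))
```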


7.2.3 Eventual Leader Service Ω


The eventual leader service Ω provides leader(l) responses at all
nodes, where l ∈ J. Eventually (assuming that not all processes
fail), the latest leader announcements should be identical at all
endpoints, and should indicate the name of a non-failed endpoint. We
again model eventual perfection using a mode variable (Figure 12).


Components of val:
    mode ∈ {perfect, imperfect}, initially imperfect
    oldfailed ⊆ J, initially ∅
    leader ∈ J ∪ {⊥}, initially ⊥


Figure 12: The composition of val in Ω.


We again use two global tasks g_1, g_2. Now g_1 sets mode to
imperfect and removes any choice of leader, while g_2 sets mode to
perfect and chooses a leader. The corresponding transition
definitions are presented in Figure 13.


Internal: compute_{i,k}
    Precondition:
        true
    Effect:
        if mode = perfect then
            add leader(leader) to resp-buffer(i)
        else
            choose j ∈ J
            add leader(j) to resp-buffer(i)


Internal: compute_{g_1,k}
    Precondition:
        true
    Effect:
        if failed ≠ oldfailed then
            leader := ⊥
            mode := imperfect
            oldfailed := failed


Internal: compute_{g_2,k}
    Precondition:
        true
    Effect:
        if failed = oldfailed ∧ leader = ⊥ then
            leader := choose l where l ∈ J − failed
            mode := perfect


Figure 13: Internal transitions in Ω.


7.3 Impossibility of Boosting


Our impossibility results for atomic and failure-oblivious services
allow arbitrary connections between processes and services. However,
it turns out that we can boost the resilience of systems containing
failure-aware services, if we allow arbitrary connection patterns.

For example, consider a system that uses wait-free
registers and 1-resilient perfect failure detectors. Suppose
that every pair of processes shares a 1-resilient 2-process
failure detector. Such a system can implement a wait-free
perfect failure detector for all processes as follows: process i
just listens to all failure detectors it is connected to and
accumulates the set of suspected processes in a dedicated
register. Periodically, it outputs its set of suspected
processes. Since every 2-process perfect failure detector is
1-resilient, it keeps operating for a non-failed endpoint even
if its other endpoint fails, and so the algorithm is wait-free.
Using this construction, f-resilient consensus, for any f, can
be implemented using wait-free registers and 1-resilient
services.
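The construction above can be sketched in Python as follows. This is a
simplified, sequential illustration rather than the paper's formal
model: the class PairwisePerfectFD, the helper names, and the use of a
plain dictionary for the per-process registers are all assumptions
made for the example.

```python
from itertools import combinations


class PairwisePerfectFD:
    """Hypothetical 2-process perfect failure detector for endpoints a, b."""

    def __init__(self, a, b):
        self.endpoints = (a, b)
        self.failed = set()

    def fail(self, i):
        """Record a failure, but only if i is one of the two endpoints."""
        if i in self.endpoints:
            self.failed.add(i)

    def query(self, i):
        """Suspect set reported to endpoint i; only failed endpoints appear."""
        return set(self.failed)


def build_detectors(processes):
    """One detector per unordered pair of processes."""
    return {
        frozenset(pair): PairwisePerfectFD(*pair)
        for pair in combinations(sorted(processes), 2)
    }


def accumulate_suspects(i, detectors, register):
    """One round of process i: poll its detectors, grow its own register."""
    for pair, fd in detectors.items():
        if i in pair:
            register[i] |= fd.query(i)
    return set(register[i])
```

A non-failed process i only consults detectors of which it is an
endpoint, and each of those has at most one failed endpoint, which is
why no detector it depends on is ever disabled.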

This boosting is, however, impossible if we assume a
system in which f-resilient failure-aware services must be
connected to all processes, so that f + 1 process failures
overall can disable all the failure-aware services. We assume
that the system may also contain f-resilient failure-oblivious
services, connected to arbitrary processes. By applying
arguments similar to the ones presented in Section 4, we can
prove boosting to be impossible, i.e., that (f + 1)-resilient
consensus cannot be solved in such a model.

The proof is also based on analysis of a “hook”. In
fact, we need to introduce only slight modifications into the
proofs of Lemmas 6 and 7: Let α_0 and α_1 be any two
univalent failure-free input-first executions whose respective
final states, s_0 and s_1, are j-similar (respectively,
k-similar). Assume, by contradiction, that α_0 and α_1 have
opposite valences. The definitions of j-similarity and
k-similarity do not restrict the states of failure-aware
services, that is, failure-aware services can have arbitrary
states in s_0 and s_1, the respective final states of α_0 and
α_1.

However, note that the f + 1 failures of processes in
J allow every failure-aware service to stop performing
(non-dummy) locally controlled steps. Then, following the
arguments of Lemmas 6 and 7, we can construct a failure-free
extension of α_0, α_0 · γ', such that (1) γ' includes
decide(v)_l for some l ∈ I − J; (2) γ' includes no locally
controlled step of process P_j, nor any perform_j, compute_j,
or output_j step for any service or register (respectively, γ'
includes no locally controlled step of service S_k); (3) γ'
includes no locally controlled step of any failure-aware
service. Thus, γ' is essentially applicable to α_1, a
contradiction with the assumption that α_0 and α_1 have
opposite valences.

We first note that Lemmas 2-5 carry over to the 
case of general services. The argument for this is iden- 
tical to that for failure-oblivious services, given in Sec- 
tion 6.3. 

For Lemma 6: The proof for the case of failure-oblivious
services already handles both atomic and failure-oblivious
services. To handle f-resilient general services, we note that
we can assume that all of these services are “silent” along γ,
since the occurrence of f + 1 fail_i actions enables a dummy
action in every task of every general service. Thus the
different definition for actions perform_{i,k}, compute_{i,k},
and compute_{g,k}, in particular their ability to observe the
set of failed processes, makes no difference. Hence γ' can be
appended after α_1 in the same way as in the proof for the
case of failure-oblivious services.

For Lemma 7: Since the service S_k can be “silenced” as
before, the proof is unchanged from that for failure-oblivious
services.

For Lemma 8: We defined the hook so that it does not
contain any fail_i actions. Hence at all states in the hook,
the set failed of failed processes is empty. Thus the
different definition for actions perform_{i,k}, compute_{i,k},
and compute_{g,k}, in particular their ability to observe the
set of failed processes, makes no difference. Hence the proof
is unchanged from that for failure-oblivious services.

Hence the following result:

Theorem 11 Let f and n be integers, 0 ≤ f < n − 1.
There does not exist an (f + 1)-resilient n-process
implementation of consensus from canonical f-resilient general
services connected to all processes, canonical f-resilient
atomic services (connected to arbitrary processes), canonical
f-resilient failure-oblivious services (connected to arbitrary
processes), and canonical reliable registers.


8 Conclusions 


We have established the impossibility of boosting 
the resilience of services in a distributed asynchronous 
system where processes are subject to undetectable 
stopping failures. Our results can be viewed as a gen- 
eralization to any number f of failures of the impos- 
sibility result of Fischer, Lynch and Paterson [8] for 
f =1. While our first result (for atomic objects) can 
be derived from existing results in the literature, the 
direct proof that we give is simpler, and is also easily 
extended to more general services than atomic objects.


Appendix A Alternative proof for atomic services


In this section, we show how our result for the 
case of atomic objects can be derived from earlier re- 
sults [3,11,16,17]. This alternative proof of our re- 
sult was obtained independently and concurrently by 
Jayanti [15] and Guerraoui and Kouznetsov [9]. How- 
ever, this alternative proof does not extend to more 
general services. 


A.1 The proof 


The following two lemmas are restatements in our
terminology of the “necessity” part and the “sufficiency”
part of Theorem 4.1 in [3], respectively.

Lemma 12 Let f and n be integers, 0 ≤ f, 1 < n.
Then there exists an f-resilient n-process implementation of
consensus from wait-free (f + 1)-process consensus objects and
reliable registers.⁴


Lemma 13 Let f and n be integers, 2 ≤ f < n. Then
there exists a wait-free (f + 1)-process implementation of
consensus from f-resilient n-process consensus objects and
reliable registers.


The following result follows easily from Herlihy’s 
universal construction [11]: 


Lemma 14 Let f and n be integers, 0 ≤ f, 1 < n.
Let T be a sequential type. Then there exists an f-resilient
n-process implementation of an atomic object of type T from
f-resilient n-process consensus objects and reliable
registers.


The following result is shown in [16]. 


Lemma 15 Let n be an integer, n > 0. There does not
exist a wait-free (n + 1)-process implementation of consensus
from wait-free n-process consensus objects and reliable
registers.


Theorem 1 Let f and n be integers, 0 ≤ f < n − 1.
There does not exist an (f + 1)-resilient n-process
implementation of consensus from f-resilient atomic objects
and reliable registers.


Proof: By contradiction, assume that there ex- 
ists an (f + 1)-resilient n-process implementation of 
consensus from f-resilient atomic objects and reliable 
registers. We consider two cases. 

First suppose that f = 0, so n ≥ 2. Thus, we have a
1-resilient n-process implementation of consensus using
0-resilient atomic objects and reliable registers. By Lemma 14,
each 0-resilient atomic object used in this implementation can
itself be implemented from 0-resilient consensus objects and
reliable registers. By substituting these implementations for
the objects, we obtain a 1-resilient n-process implementation
of consensus using 0-resilient consensus objects and reliable
registers. Now, a 0-resilient consensus object can be
implemented from reliable registers,⁵ so substituting once
more, we obtain a 1-resilient n-process implementation of
consensus using only reliable registers. But this contradicts
the impossibility result of [17].

Now suppose that f ≥ 1. By Lemma 14, each f-resilient
atomic object used in this implementation can itself be
implemented from f-resilient consensus objects and reliable
registers. By substituting, we obtain an (f + 1)-resilient
n-process implementation of consensus from f-resilient
consensus objects and reliable registers. By Lemma 12, each
f-resilient consensus object used in this implementation can
be implemented from wait-free (f + 1)-process consensus
objects and reliable registers. By substituting again, we
obtain an (f + 1)-resilient n-process implementation of
consensus from wait-free (f + 1)-process consensus objects and
reliable registers. Now by Lemma 13 (using the fact that
2 ≤ f + 1 < n), a wait-free (f + 2)-process consensus object
can be implemented from (f + 1)-resilient n-process consensus
objects and reliable registers. By substituting, we obtain an
implementation of a wait-free (f + 2)-process consensus object
from wait-free (f + 1)-process consensus objects and reliable
registers. But this contradicts Lemma 15.


4 Theorem 4.1 in [3] assumes 2 ≤ f. However, the necessity
part of the theorem requires only 0 ≤ f.

5 A 0-resilient consensus object with an endpoint set J can be
easily implemented from two reliable registers as follows.
Every process participating in the consensus algorithm writes
its input value in a dedicated “proposal” register R
(initialized to ⊥). Then the process keeps reading a dedicated
“decision” register D (initialized to ⊥) until a non-⊥ value
is read, in which case the process decides on this value. In
parallel, a dedicated process P_i (i ∈ J) keeps reading R. As
soon as P_i reads a non-⊥ value v in R, P_i writes v in D.
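The two-register 0-resilient consensus construction used in the case
f = 0 can be illustrated by the following sequential simulation. It is
a sketch under our own assumptions: the function name, the explicit
failure-free schedule (a list of process identifiers), and the
convention that a scheduled coordinator also takes one coordinator
step are not part of the paper, and input values are assumed to differ
from None, which stands for ⊥.

```python
BOTTOM = None  # stands for the initial register value ⊥


def zero_resilient_consensus(inputs, coordinator, schedule):
    """Replay a failure-free schedule; return the decisions reached."""
    R = BOTTOM          # shared "proposal" register
    D = BOTTOM          # shared "decision" register
    proposed = set()    # processes that have already written R
    decisions = {}      # process id -> decided value

    for p in schedule:
        if p not in proposed:
            # First step of p: write its input into R.
            R = inputs[p]
            proposed.add(p)
        elif p not in decisions and D is not BOTTOM:
            # Later steps of p: read D and decide once it is non-⊥.
            decisions[p] = D
        if p == coordinator and D is BOTTOM and R is not BOTTOM:
            # Coordinator step: copy the first non-⊥ value seen in R.
            D = R
    return decisions
```

Since D is written at most once, all decisions agree, and the value
written is some process's input. If the coordinator never takes a
step, nobody decides, which matches the fact that the object is only
0-resilient.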


A.2 Extension to more general services 


The argument in the previous subsection does not 
extend to all services. Here we give two reasons for 
this. 

First, the universality result fails to hold for many 
distributed services. In particular, no meaningful fail- 
ure detector can be implemented from consensus ob- 
jects. Indeed, by definition, an atomic service does 
not provide any information about failures: the value 
of the service is not affected by failures of processes. 
Here we simply give an example, showing that consen- 
sus cannot implement a perfect failure detector. 

Indeed, assume, by contradiction, that there is an
algorithm A that implements a perfect failure detector in a
system of n processes using n-process consensus objects and
registers. Consider any finite execution α of A in which
process i is faulty and is declared to be faulty. Now we
consider an execution α' that is identical to α except that α'
includes no fail_i event (i is just slow to take steps in α').
Clearly, α' is also a finite execution of A, since registers
and consensus objects are failure-oblivious. Thus, in α', a
process is declared faulty without having failed, a
contradiction.
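The indistinguishability step in this argument can be made concrete
with a toy failure-oblivious register. The event encoding and the
function name below are hypothetical, chosen only for illustration;
the point is that the object records fail events but never consults
them, so removing them leaves every response unchanged.

```python
def run(events):
    """Replay events against a failure-oblivious register.

    Events are tuples: ("fail", proc), ("write", proc, value),
    or ("read", proc). Returns the list of responses produced.
    """
    value = None
    failed = set()   # recorded, but never read by any transition
    responses = []
    for ev in events:
        kind = ev[0]
        if kind == "fail":
            failed.add(ev[1])
        elif kind == "write":
            _, proc, v = ev
            value = v
            responses.append((proc, "ack"))
        elif kind == "read":
            _, proc = ev
            responses.append((proc, value))
    return responses


# An execution in which process 2 fails, and the same one without fail_2.
alpha = [("write", 1, "x"), ("fail", 2), ("read", 0),
         ("write", 0, "y"), ("read", 1)]
alpha_prime = [ev for ev in alpha if ev[0] != "fail"]
```

Both executions yield the same responses, so no process observing only
such objects can tell a failed process from a slow one.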

The second reason why the arguments of [3] do not
work with non-atomic services is that, generally speaking, an
f-resilient implementation of n-process consensus is not
equivalent to a wait-free implementation of (f + 1)-process
consensus (Theorem 4.1 of [3]). Indeed, if f-resilient
k-process consensus is implemented from non-atomic services,
the simulation algorithm presented in the proof of Theorem 4.1
in [3] is not valid: a step of a process accessing a general
service cannot always be simulated by another process. This is
because a response of a non-atomic service to a given process
i might not necessarily be simulated by another process j
without communicating with i, i.e., no set of f + 1 processes
can independently simulate an f-resilient k-process consensus
algorithm without communicating with the rest of the system.


Appendix B Complete proofs for failure-oblivious services


Proof of Lemma 6 when failure-oblivious services 
are allowed. 


Lemma 6 Let j ∈ I. Let α_0 and α_1 be finite
failure-free input-first executions, s_0 and s_1 the
respective final states of α_0 and α_1. Suppose that s_0 and
s_1 are j-similar. If α_0 and α_1 are univalent, then they
have the same valence.


Proof: We proceed by contradiction. Without 
loss of generality, assume that all services are failure- 
oblivious. Atomic services can be handled by the same 
argument as used in the proof of Lemma 6 for atomic 
services only. 

Fix j, α_0, α_1, s_0, and s_1 as in the hypotheses of the
lemma, and suppose (without loss of generality) that α_0 is
0-valent and α_1 is 1-valent. Let J ⊆ I be any set of indices
such that j ∈ J and |J| = f + 1. Since f < n − 1 by
assumption, we have |J| < n, and so I − J is nonempty.

Consider a fair extension of α_0, α_0 · β, in which the
first f + 1 actions of β are fail_i, i ∈ J, and no other fail
actions occur in β. Note that, for all i ∈ J, β contains no
output actions of P_i. Assume that in β, no perform_{i,c},
compute_{i,c}, or b_{i,c} action of any i-* task, i ∈ J,
occurs at any component c ∈ K ∪ R; we may assume this because,
for each i ∈ J, action fail_i enables a dummy action in every
task of every service and register (* is perform or compute or
output).

Further assume that in β, no compute_{g,c} action of any
g-compute task occurs at any component c ∈ K ∪ R; we may
assume this because the occurrence of f + 1 fail_i actions
enables the dummy_compute_{g,c} action in every g-compute task
of every failure-oblivious service c.

Since α_0 is a failure-free input-first execution, the
resulting extension α_0 · β is a fair input-first execution
containing f + 1 failures. Therefore, the termination property
for (f + 1)-resilient consensus implies that there is a finite
prefix of α_0 · β, which we denote by α_0 · γ, that includes
decide(v)_l for some l ∉ J and v ∈ {0, 1}. Construct α_0 · γ',
where γ' is obtained from γ by removing the fail_i actions,
all dummy actions, and any remaining internal actions of P_i,
i ∈ J. Thus, α_0 · γ' is a failure-free extension of α_0 that
includes decide(v)_l. Since α_0 is 0-valent, v must be equal
to 0.

We claim that decide(0)_l occurs in the suffix γ',
rather than in the prefix α_0. Suppose for contradiction that
the decide(0)_l action occurs in the prefix α_0. Then by our
technical assumption about processes, the decision value 0 is
recorded in the state of P_l. Since s_0 and s_1 are j-similar
and l ≠ j, the same decision value 0 appears in the state s_1.
But this contradicts the assumption that α_1, which ends in
s_1, is 1-valent. So it must be that decide(0)_l occurs in the
suffix γ'.
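The step that turns γ into the failure-free fragment γ' is a pure
filtering operation, which the following sketch makes explicit. The
encoding of actions as (kind, owner, payload) tuples and the function
name are assumptions made for the example; they are not the paper's
I/O automaton notation.

```python
def strip_failures(gamma, J):
    """Return gamma': gamma without fail actions, dummy actions,
    and internal actions of the processes indexed by J."""
    def keep(action):
        kind, owner = action[0], action[1]
        if kind in ("fail", "dummy"):
            return False
        if kind == "internal" and owner in J:
            return False
        return True

    return [a for a in gamma if keep(a)]
```

The relative order of the surviving actions is preserved, so a
decide action that occurs in γ still occurs in γ'.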
Now we show how to append essentially the same
γ' after α_1. We know that, for every i ∈ J, γ' contains no
locally controlled action of P_i, and contains no
perform_{i,c}, compute_{i,c}, or b_{i,c} action, for any
c ∈ K ∪ R.


By definition of j-similarity and j ∈ J, we have:


(a) For every i ∉ J, the state of P_i is the same in s_0
and s_1.


(b) For every c ∈ K ∪ R,


1. The value of val_c is the same in s_0 and s_1 (that
is, in the final states of α_0 and α_1).

2. For every i ∈ J_c − J, the value of buffer(i)_c is
the same in s_0 and s_1.


Thus:


(c) If γ' contains any locally controlled steps of a
process i, then i ∉ J, and so the state of P_i is the same in
s_0 and s_1.


(d) For every c ∈ K ∪ R,


1. The value of val_c is the same in s_0 and s_1.


2. For every i ∈ J_c, if γ' contains any perform_{i,c},
compute_{i,c}, or output_{i,c} actions, then i ∉ J, and so the
value of buffer(i)_c is the same in s_0 and s_1.


Finally, we note that the presence of compute_{g,c}
does not invalidate the argument. A compute_{g,c} cannot refer
to or modify any input buffers. The precondition of
compute_{g,c} depends only on val_c, and so the same
compute_{g,c} actions can be applied in γ' after α_1, and they
can add the same items to the output buffers. Thus for i ∉ J
the sequence of values that buffer(i)_c takes along γ' after
α_0 and γ' after α_1 are the same.

It follows that it is possible to append “essentially”
the same γ' after α_1, resulting in a failure-free extension
of α_1 that includes decide(0)_l.⁶

But α_1 is 1-valent, a contradiction.


Proof of Lemma 7 when failure-oblivious services 
are allowed. 


Lemma 7 Let k ∈ K. Let α_0 and α_1 be finite
failure-free input-first executions, s_0 and s_1 the
respective final states of α_0 and α_1. Suppose that s_0 and
s_1 are k-similar. If α_0 and α_1 are univalent, then they
have the same valence.


Proof: Fix k, α_0, α_1, s_0, and s_1 as in the hypotheses
of the lemma. By contradiction, suppose (without loss of
generality) that α_0 is 0-valent and α_1 is 1-valent. Let
J ⊆ I be any set of indices such that |J| = f + 1, and, if
|J_k| ≤ f + 1, then J_k ⊆ J, whereas if |J_k| > f + 1, then
J ⊆ J_k.
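The choice of J in this proof can be spelled out as a short function.
The name choose_J and the tie-breaking by sorted order are our own
assumptions; any set satisfying the two stated conditions would do.
The sketch relies on f + 1 ≤ |I|, which follows from f < n − 1.

```python
def choose_J(I, J_k, f):
    """Pick J with |J| = f + 1 such that J_k ⊆ J when |J_k| ≤ f + 1,
    and J ⊆ J_k otherwise."""
    size = f + 1
    members = sorted(J_k)
    if len(members) <= size:
        # Take every endpoint of S_k, then pad with other process indices.
        rest = [i for i in sorted(I) if i not in J_k]
        return set(members) | set(rest[: size - len(members)])
    # S_k has more than f + 1 endpoints: fail f + 1 of them.
    return set(members[:size])
```

In the first case every endpoint of S_k fails, and in the second case
more than f of its endpoints fail, so in both cases the f + 1 failures
are enough to silence S_k.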

Consider a fair extension of α_0, α_0 · β, in which the
first f + 1 actions of β are fail_i, i ∈ J, and no other fail
actions occur in β. Note that, for all i ∈ J, β contains no
output actions of P_i. Assume that in β, no perform_{i,k},
b_{i,k}, compute_{i,k}, or compute_{g,k} action (b ∈ resps,
g ∈ glob) of S_k occurs; we may assume this because the f + 1
fail actions enable dummy actions in all tasks of S_k.

Since α_0 is a failure-free input-first execution, the
resulting extension α_0 · β is a fair input-first execution
containing f + 1 fail actions. Therefore, the termination
property for (f + 1)-resilient consensus implies that there is
a finite prefix of α_0 · β, which we denote by α_0 · γ, that
includes decide(v)_l for some l ∈ I − J and v ∈ {0, 1}. We
know that decide(0)_l occurs in the suffix γ, rather than in
the prefix α_0, by an argument similar to that in the proof of
Lemma 6.

Now construct α_0 · γ', where γ' is obtained from γ by
removing all the fail_i actions, i ∈ J, and all dummy actions.
Thus, α_0 · γ' is a failure-free extension of α_0 that
includes decide(v)_l. Since α_0 is 0-valent, v must be equal
to 0.

Now we show how to append essentially the same γ'
after α_1. By definition of k-similarity, we have:


(a) For every i ∈ I, the state of P_i is the same in s_0
and s_1.


(b) For every c ∈ (K − {k}) ∪ R, the state of S_c is the
same in s_0 and s_1.


Thus:


(c) For every c ∈ K ∪ R, if γ' contains any perform_{i,c},
b_{i,c}, compute_{i,c}, or compute_{g,c} actions of S_c, then
the state of S_c is the same in s_0 and s_1, since c ≠ k in
this case.


By properties (a) and (c), it follows that it is possible
to append “essentially” the same γ' after α_1 (differing only
in the state of S_k), resulting in a failure-free extension of
α_1 that includes decide(0)_l. But α_1 is 1-valent, a
contradiction.


6 Really, we are appending another execution fragment after
α_1, one that looks the same to all the processes and service
tasks that take steps in γ'.


Proof of Lemma 8 when failure-oblivious services 
are allowed. 


Lemma 8 We establish the same 5 claims as in the 
case of atomic services, which establishes the needed 
contradiction. 

Claims 1, 2, and 5 do not refer to the definition of 
a service, and so their proof remains unchanged from 
the atomic services case. 

The proof of Claim 3 is unchanged, since the only
actions considered have as participants either a process P_i,
or P_i and a component S_c, c ∈ K ∪ R. Thus, whenever S_c is a
participant, the action must be an external action of S_c.

Since the external actions in the definitions of atomic
service and failure-oblivious service have the same effect,
namely to add or remove a single item from a single buffer, it
follows that the proof of Claim 3 for the atomic case still
applies.

The proof of Claim 4 is modified as follows. 


Claim 4: There does not exist k ∈ K such that
S_k ∈ participants(e, s) ∩ participants(e', s).
Suppose for contradiction that
S_k ∈ participants(e, s) ∩ participants(e', s). There are four
possibilities:


1. participants(e, s) = participants(e', s) = {S_k}.
Then e and e' must be i-perform or i-compute or g-compute
tasks of S_k, and so involve only the state of S_k. But then
the states s_0 and s_1 can differ only in the state of S_k. So
s_0 and s_1 are k-similar, a contradiction.


2. For some i ∈ I, participants(e, s) = {S_k, P_i} and
participants(e', s) = {S_k}.
Hence action(e, s) is either a_{i,k} or b_{i,k}, and
action(e', s) is one of perform_{j,k}, compute_{j,k}, or
compute_{g,k}, where j ∈ J_k, g ∈ glob.
Inspection of the definition of a failure-oblivious service
shows that the two tasks commute, that is, e'(s_0) = s_1, a
contradiction.


3. For some i ∈ I, participants(e', s) = {S_k, P_i} and
participants(e, s) = {S_k}.
Hence action(e, s) is one of perform_{j,k}, compute_{j,k}, or
compute_{g,k}, where j ∈ J_k, g ∈ glob, and action(e', s) is
either a_{i,k} or b_{i,k}.
Inspection of the definition of a failure-oblivious service
shows that the two tasks commute, that is, e'(s_0) = s_1, a
contradiction.


4. For some i, j ∈ I, participants(e, s) = {S_k, P_i} and
participants(e', s) = {S_k, P_j}.
By Claim 3, we know that i ≠ j. Now action(e, s) is either
a_{i,k} or b_{i,k}, and action(e', s) is either a_{j,k} or
b_{j,k}.
Inspection of the definition of a failure-oblivious service
shows that the two tasks commute, that is, e'(s_0) = s_1, a
contradiction.
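The commutation used in the fourth case can be checked on a toy model
of the buffers of S_k. This sketch assumes a simplified state with one
invocation buffer and one response buffer per endpoint, stored in
dictionaries under the hypothetical keys "inv" and "resp"; it covers
only external actions of two distinct endpoints i ≠ j, each of which
touches a single buffer.

```python
from copy import deepcopy


def invoke(state, i, a):
    """External action a_{i,k}: append invocation a to inv-buffer(i)."""
    s = deepcopy(state)
    s["inv"][i].append(a)
    return s


def respond(state, j):
    """External action b_{j,k}: remove the head of resp-buffer(j).

    Assumes resp-buffer(j) is nonempty, as the precondition requires.
    """
    s = deepcopy(state)
    s["resp"][j].pop(0)
    return s
```

Because the two actions modify different buffers, applying them in
either order from the same state produces the same state, which is the
property the claim relies on.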


